Heart failure (HF) is a major public health problem, affecting up to 2% of the global population.1,2 HF accounts for significant morbidity and mortality and healthcare costs worldwide.3 Approximately 10% of the elderly population in the western world have HF.2 The prevalence of HF is increasing due to aging of the population and improved treatments for acute cardiovascular (CV) events, despite the efficacy of many therapies for patients with HF with reduced ejection fraction (HFrEF). In the US alone, approximately 1 million new HF cases are diagnosed annually and HF is one of the leading causes of death.4
Evolution of Endpoints in Clinical Trials in Heart Failure
Until the 1970s, treatment options for HF were limited to digitalis and diuretics, with a focus on improvements in symptoms. This was followed by studies with vasodilators that demonstrated improvements in haemodynamics along with symptoms. With the recognition of these haemodynamic benefits, vasodilators were then studied for their effect on mortality in patients with HF.5 The first of these trials, V-HeFT I, was the first major randomised placebo-controlled trial in CV medicine that showed a trend towards mortality reduction with vasodilators.5
However, in the late 1980s the paradigm changed from haemodynamics to neurohormonal blockade with the demonstration of mortality benefit with angiotensin-converting enzyme inhibitors in HF patients, the superiority of these agents over vasodilators for survival benefit and their consistent benefit across different stages of HF.6–8 This was followed by large-scale trials in the late 1990s showing survival benefit with beta-blockers, mineralocorticoid receptor antagonists and, more recently, angiotensin receptor–neprilysin inhibitors and sodium–glucose cotransporter 2 inhibitors (SGLT2i) in patients with HFrEF.9–14 In most of these trials, the results were concordant in terms of efficacy for improvement in symptoms, functional and exercise capacity, hospitalisations and safety.
Conversely, historical studies with inotropic agents showed evidence of adverse outcomes with increased mortality, despite improvements in haemodynamic profile, symptoms and functional capacity.15,16 The risk of increased mortality with inotropic agents culminated in a regulatory pathway that requires clinical trials to address mortality, either independently or combined with other endpoints.17
With the recognition of HF hospitalisations as one of the strongest markers of mortality, disease severity and healthcare burden, the focus on mortality was followed by an emphasis in recent clinical trials in HF on a reduction in HF hospitalisations as a combined endpoint with mortality or CV mortality.12–14,18–20 It was critical for a drug to demonstrate no increase in mortality but, when the combined endpoint of HF hospitalisations and CV or all-cause mortality was reduced, it also was clinically important to clarify whether the benefit was due to a reduction in HF hospitalisations, mortality or both. As such, drugs such as ivabradine or digoxin, which have been shown to reduce HF hospitalisations but not mortality, received a lower class of recommendation (Class II rather than Class I) in practice guidelines for HF.18,21–23
The emphasis on the combined endpoint of CV death and HF hospitalisations has been further enhanced by recent trials with SGLT2i.13,14,24–26 Historically, following the regulatory guidance outlined in 2008 by the Food and Drug Administration (FDA) for new drugs for type 2 diabetes, many large randomised controlled trials have been conducted with the primary goal of assessing the safety of antihyperglycaemic medications on the primary endpoint of major adverse CV events (MACE), defined as CV death, non-fatal MI or non-fatal stroke.27 HF was not specifically mentioned in the FDA guidance.27 However, several trials subsequently showed the strong impact of antihyperglycaemic drugs on HF outcomes, which were not originally specified as the primary endpoint of the trials.28 With the recognition of the consistent risk reduction in HF hospitalisation seen across all trials with SGLT2i in patients with diabetes, new trials have been conducted with SGLT2i in patients with HFrEF with or without diabetes, which again demonstrated the safety and efficacy of these agents in reducing the combined endpoint of CV mortality and HF hospitalisations in patients with HFrEF regardless of the presence of diabetes.13,14 This underscores the importance of HF end-points in all CV trials and that the CV trials should not solely focus on MACE endpoints, which tend to emphasise ischaemic endpoints but not HF events.
Recognising the dynamic changes in the health care delivery models that have resulted in avoidance of hospitalisations and escalation of therapies in the setting of observation units or urgent care, the hospitalisation endpoints have been expanded to include urgent or emergency care or the requirement for intravenous diuretic therapy in addition to hospitalisations for HF. Furthermore, with the expansion of virtual visits, especially in the context of the coronavirus disease 2019 pandemic, other forms of encounters, including home-based therapies or virtual encounters, will likely be included in future endpoints.
HF therapies that theoretically improve congestion and improve haemodynamic changes may also have unintended adverse consequences, such as renal or myocardial injury, that may offset their benefits. This was demonstrated in the ultrafiltration clinical trials. Acute cardiorenal syndrome occurs frequently in patients hospitalised with HF exacerbation and is a predictor of poor outcomes.29 Venous ultrafiltration, due to precise control of the rate and volume of fluid removal and less activation of the neurohormonal axis, was proposed as a potential therapy to improve congestion and treat kidney dysfunction in patients hospitalised with acute decompensated HF and cardiorenal syndrome not responding to medical therapy.30 However, in the CARRESS-HF trial, ultrafiltration, compared with diuretic-based therapy, in patients with acute HF did not demonstrate significant differences in weight loss at 96 hours, 60-day mortality or the rate of hospitalisation, but it significantly worsened serum creatinine.31 Moreover, ultrafiltration was associated with higher rates of adverse events attributed to a higher incidence of kidney failure, bleeding, and intravenous catheter-related complications.32 The endpoints related to devices and interventions have evolved over time. More recent studies entail not only device efficacy and safety endpoints, but also clinical outcomes and patient-reported outcomes, which are addressed in the following section.
However, the strict emphasis on hard endpoints in clinical trials has historically created a predominant focus on mortality and/or hospitalisation benefit, with limited recognition of improvements in symptoms, quality of life and functional and exercise capacity, which are critical parameters for patients and shared decision making. Recently, the FDA provided guidance to make it clear that an effect on symptoms or physical function, without a favourable effect on survival or risk of hospitalisation, can, in fact, be a basis for approving therapies to treat HF.33
Although this complemented focus on patient-centric outcomes and quality of life will be an important paradigm change, the approach for regulatory drug approval in the US will likely require a safety signal, with a requirement for no evidence of an increase in mortality or hospitalisations.17
Of course, one needs to keep in mind that hospitalisations and decompensations requiring intravenous interventions are also important endpoints from patient perspectives because they result in poor quality of life. Despite current treatments, rates of hospital admissions and readmissions for HF have shown little improvement during the past three decades, with substantial healthcare costs attributable to HF hospital admissions.2 Implantable systems for chronic monitoring of pulmonary artery pressures (CardioMEMS Heart Sensor) guide haemodynamic-targeted outpatient management of patients with chronic HF and have been shown to result in a significant reduction in hospital admission for HF and to improve quality of life, as assessed by the Minnesota Living with Heart Failure Questionnaire.34–36 Clinical trials with implantable cardiac monitoring systems targeted changes in haemodynamic measurements combined with reductions in HF hospitalisation as endpoints of efficacy and device- and system-related complications as endpoints of safety, providing an example of unique haemodynamic and safety endpoints relevant to device efficacy and safety, combined with clinical endpoints relevant to patients and systems of care, such as quality of life and readmission rates.36,37
Endpoints combining efficacy and safety were also reported in trials with percutaneous valvular interventions in patients with HF. Transcatheter-delivered device therapy known as edge-to-edge leaflet repair (MitraClip) is a promising therapeutic option in patients with HF and severe functional mitral regurgitation. In the COAPT trial, the primary efficacy outcome of HF hospitalisation within 24 months was significantly lower in the MitraClip arm than in the medical therapy (control) group, with no difference in the primary safety outcome of freedom from device-related complications at 12 months.38 Secondary outcomes assessed in the COAPT trial included parameters related to quality of life, including patient-reported changes in the Kansas City Cardiomyopathy Questionnaire (KCCQ) and the 6-minute walk test (6MWT), and echocardiographic parameters (changes in left ventricular end-diastolic volume, mitral regurgitation severity and tricuspid regurgitation).38 In the COAPT trial, although mortality was not the primary endpoint, a prominent finding of the clinical trial was a significantly lower rate of mortality at 1 year.38
A second, smaller, randomised controlled trial assessing percutaneous mitral valve repair also evaluated all-cause mortality and HF hospitalisation but did not show a significant difference in these clinical endpoints between percutaneous repair and medical therapy alone.39 Quality of life and functional capacity in secondary mitral regurgitation are important parameters, and percutaneous mitral valve repair has been shown to positively affect both in prospective registries.40,41 Based on the results of the clinical trials, the FDA approved the use of transcatheter mitral valve repair for functional mitral regurgitation.
The differences in endpoints for devices and drugs are also driven, in part, by the differences in FDA approval processes for the two types of therapies. Although the FDA requires device trials to demonstrate device safety and efficacy, the level of evidence required for approval is often less rigorous than the hard endpoints required for new drug approval.
Data Standards in Cardiovascular Endpoint Definitions
A major limitation in comparing outcomes among trials within and across drug and device development programs has been the lack of uniform definitions for HF and key endpoint events. Attempts have been made to develop definitions that are characterised by objective criteria and reported uniformly, and such definitions have evolved over time.42–45 The standardisation of definitions helps ensure optimal capture of HF events despite differences in the threshold for hospitalisation worldwide and increasing pressure, especially in the US, to reduce the number of HF hospitalisations.
HF events that are not hospitalisations have prognostic significance similar to HF hospitalisations. Because mortality continues to be important for drug or device approval, it is often included as part of the primary endpoint, along with HF hospitalisations and similar events, such as urgent care or emergency department visits, that result in intravenous therapies with diuretics and/or vasoactive agents, which are suggestive of decompensation that may result in hospital visits or therapies, adding to healthcare dollars and potentially affecting patient prognosis. Not all HF events are equal, making comparisons across the different drugs and devices difficult.
Heart Failure Event
The most recent Cardiovascular and Stroke Endpoint Definitions for Clinical Trials, developed by the Standardized Data Collection for Cardiovascular Trials Initiative and the FDA, define hospitalised and non-hospitalised HF events as relevant endpoints in HF trials and trials of non-HF therapies in which the therapy may affect the risk of HF.43 An HF event includes hospitalisations for HF and urgent outpatient visits and is defined as a constellation of signs, symptoms, diagnostic testing and HF-directed therapy, as described in Table 1. It is emphasised that HF hospitalisations should be delineated from urgent visits, and that if urgent visits are included in the HF event endpoint, the number of urgent visits needs to be explicitly presented separately from the number of hospitalisations.43
Heart Failure Hospitalisation
To fulfil the criteria for an HF hospitalisation, a patient is required to have an unscheduled hospital admission for a primary diagnosis of HF with a length of stay that either exceeds 24 h or crosses a calendar day.43 The patient should also have typical signs, symptoms and diagnostic testing results consistent with the diagnosis of HF (Table 1). Objective diagnostic findings supporting the diagnosis of HF include elevated natriuretic peptides, radiological evidence of pulmonary congestion and either echocardiographic or invasive evidence of elevated filling pressures. In addition to these signs and symptoms, the patient should be receiving treatment specifically directed at HF, including initiation of intravenous diuretic or vasoactive agents (e.g. vasodilator, vasopressor or inotropic therapy), or mechanical circulatory support or fluid removal (Table 1).43
Urgent Outpatient Visits
To satisfy the criteria for a non-hospitalised HF event, the patient must have an urgent, unscheduled office or emergency visit for HF with signs, symptoms and diagnostic testing similar to those described for HF hospitalisation (Table 2). The patient must also require therapy similar to that described previously for an HF hospitalisation, including initiation of intravenous diuretic or vasoactive agents (e.g. vasodilator, vasopressor or inotropic therapy), or fluid removal.43 It is important to note that clinic visits for the electively scheduled administration of HF therapies or procedures (e.g. IV diuretics, intravenous vasoactive agents or mechanical fluid removal) do not qualify as non-hospitalised HF events.43
Other than HF events, the clinical endpoints described below are reported as safety or efficacy endpoints in HF clinical trials.
Death is usually reported as an efficacy or safety endpoint in clinical trials. In CV studies, when the specific cause of death is important, adjudication using standardised definitions is recommended.43 The collection of appropriate source documentation is critical for rigorous adjudication of the cause of death. Although death certificates establish that the patient died, reliance on information included in death certificates may be problematic.43 Autopsy reports can be valuable in assessing the cause of death, but may not always be available.46
CV deaths include deaths that result from an acute MI (AMI), sudden cardiac death, death due to HF, death due to stroke, death due to CV procedures, death due to CV haemorrhage and death due to other CV causes.43 Classification of deaths as CV or non-CV is aimed at capturing the primary cause of death.43 The primary cause as defined here is the underlying disease or injury that initiated the train of events resulting in death. Thus, when an AMI leads to a fatal arrhythmia, the primary cause of death would be the AMI.43 The clinical progression toward a fatal outcome is often manifested by multiple intermediate steps, and identifying the primary cause requires careful consideration. The primary cause may be distinct from both the mode of death and an intervening cause that is temporally closer and contributes to the death.43 In patients with HFrEF and New York Heart Association (NYHA) Class II and III HF, approximately 90% of deaths are classified as being due to CV causes and 10% are documented as being due to non-CV causes.47
The mode of death is generally regarded as the physiological derangement or the biochemical disturbance produced by the cause of death and should not be substituted for the primary cause. Non-CV causes of death (e.g. renal failure) often ultimately culminate in a CV mode of death (e.g. arrhythmia) that should not be confused with CV death. In addition, the overlap between the primary cause of death and mode of death can also render the subclassification of CV deaths difficult.43
Heart Failure Death
HF death is defined as a death that occurred as a result of worsening symptoms and/or signs of HF, or intractable HF. The death generally occurs during or following hospitalisation but could occur at home, at a long-term care facility or in hospice care. Terminal arrhythmias associated with HF deaths are usually classified as HF death. HF secondary to a recent MI should be classified as an MI death. Patients with worsening HF usually have symptoms and signs of HF and diagnostic evidence of HF, such as an abnormal chest X-ray and a significant increase in natriuretic peptide concentrations.
When sufficient information is available, HF death can be subcategorised as with or without low output and/or congestion. Low output is usually indicated by fatigue, signs of vasoconstriction, prerenal azotaemia, the need for vasopressors, low cardiac output or hypotension. Congestion is usually indicated by symptoms and signs on physical examination, chest X-ray and non-invasive and invasive measurements.
The device studies, especially percutaneous devices for the management of acute/decompensated HF with cardiogenic shock, also provide a different perspective for endpoints in clinical trials in patients with HF. In refractory circulatory shock, mechanical circulatory support devices, including pulsatile (intra-aortic balloon pump [IABP]), axial continuous (Impella) or centrifugal continuous (TandemHeart) pumps or extracorporeal membrane oxygenation units result in distinct haemodynamic changes and ventricular pressure/volume unloading to improve cardiac output and blood pressure. Unlike drug trials that rely on hard clinical endpoints, most clinical trials studying percutaneous left ventricular assist devices (pLVAD) have relied on demonstrating improvements in specific haemodynamic parameters that the device is designed to achieve, such as cardiac output, arterial pressure, pulmonary capillary wedge pressure, right atrial pressure or systemic vascular resistance.48–51
Given these trials were conducted in shock patients at high risk of mortality, symptoms were not taken into account. However, importantly, because they are highly invasive techniques, procedural complications were considered as endpoints. The measurement of haemodynamic surrogate endpoints in these studies reflected treatment effect that is expected to correlate with clinical benefit. Therefore, surrogate endpoints can be important, as has been the case in certain device trials or small exploratory trials with relatively short follow-up, in which it can be difficult to power for symptom- or survival-based clinical endpoints. This was demonstrated by a meta-analysis of clinical trials comparing pLVAD to IABP showing that although the pLVAD devices significantly improved haemodynamics, neither of the two therapies improved 30-day mortality, likely to be due to a small number of patients in all the trials combined.31
Other Evolving Endpoints
In clinical trials, the time-to-event analysis of clinical endpoints, such as mortality and HF hospitalisations, censors hospitalisations after the initial event, discounting the clinical burden of multiple repeated hospitalisations. Conversely, patients with prolonged index hospital stays have less time at risk of rehospitalisation, and patients who die are no longer at risk of rehospitalisation.
Days Alive and Out of the Hospital
For interventions without an impact on the initial length of stay (LOS), the composite of death and repeat hospital stay may be a better endpoint. For studies of interventions that may have an effect on the initial LOS, ‘days alive and out of the hospital’, which combines mortality, the LOS of the index hospital stay and the burden of subsequent hospital stays, would be a better endpoint.
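The arithmetic behind 'days alive and out of the hospital' can be sketched as follows. This is a minimal illustration, not a validated trial definition: the record layout, the field names and the 180-day follow-up window are assumptions made for the example, and actual trials prespecify how partial days, transfers and censoring are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PatientFollowUp:
    """Hypothetical record layout; field names are illustrative only."""
    follow_up_days: int                      # planned follow-up window from randomisation
    index_los_days: int                      # length of the index hospital stay
    rehospitalisation_days: List[int] = field(default_factory=list)  # LOS of each later stay
    death_day: Optional[int] = None          # day of death from randomisation, if any


def days_alive_and_out_of_hospital(p: PatientFollowUp) -> int:
    # Days alive within the window, truncated at death if the patient died.
    alive = p.follow_up_days if p.death_day is None else min(p.death_day, p.follow_up_days)
    # Days in hospital: the index stay plus every subsequent stay.
    in_hospital = p.index_los_days + sum(p.rehospitalisation_days)
    # Guard against inconsistent records where hospital days exceed days alive.
    return max(alive - in_hospital, 0)


survivor = PatientFollowUp(follow_up_days=180, index_los_days=6, rehospitalisation_days=[4, 3])
decedent = PatientFollowUp(follow_up_days=180, index_los_days=10, death_day=60)

print(days_alive_and_out_of_hospital(survivor))  # 167
print(days_alive_and_out_of_hospital(decedent))  # 50
```

In this sketch a longer index stay, repeated admissions and earlier death all lower the same score, which is why the measure suits interventions that may change the initial LOS.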
Heart Failure Versus All-Cause Hospitalisations
Although HF hospitalisations are of specific interest in drug development in HF, the effect on all-cause hospitalisations would also be of interest, especially for treatment strategies that can affect a variety of comorbidities, such as a medication that may reduce the incidence of AF along with HF events or a glucose-lowering drug that may reduce hospitalisations due to diabetes as well as HF.
Due to the cluster of comorbidities and increased prevalence of all-cause hospitalisations in patients with HF with preserved ejection fraction (HFpEF), all-cause hospitalisations may also be of interest in patients with HFpEF.52 All-cause hospitalisations would also be of interest in device- or intervention-related therapies, especially if the intervention has procedural-related risk or complications that may require hospitalisation related to the intervention. However, it should be kept in mind that statistical power and sensitivity are greatly enhanced by examining the specific categories of hospitalisations, such as HF hospitalisations (Table 1), that one expects treatment to affect rather than including insensitive outcomes (e.g. cancer or stroke hospitalisation that is not expected to be affected). A second problem with chiefly focusing on overall hospitalisations is a loss of power when one only counts the first hospitalisation per patient (e.g. as in a time to event analysis).53–55
On the basis of SOLVD trial data, approximately 38% of hospitalisations for HF occurred after hospitalisation for another cause.8 Therefore, using total hospitalisations leads to a loss in statistical power because of the inclusion of a large number of events that are insensitive and a loss in events that are truly sensitive. In the SOLVD trial, the use of first hospitalisation instead of HF hospitalisation would have substantially reduced power.
Global Ranking Approach
The global ranking approach is a strategy for incorporating multiple aspects of the clinical course, including both events and quantitative measures of functional status (e.g. quality of life assessment, 6MWT or biomarkers of cardiac injury), based on a prespecified hierarchical ranking system and may provide many of the advantages of composite endpoints while avoiding pitfalls.56 The basis for using a prespecified hierarchical ranking system lies in the discrepancies often found between Phase II and III studies, where the Phase II study shows improvement in symptoms or congestion but the positive findings do not translate to positive results when the Phase III study is completed. One possible hypothesis suggested for this is that the improvement in symptoms or congestion occurs at the cost of unintended consequences, such as renal or myocardial injury.57
Biomarkers are commonly used to assess congestion and myocardial injury and include B-type natriuretic peptide, N-terminal pro B-type natriuretic peptide (NT-proBNP) and troponins I and T. Thus, combining biomarkers and clinical endpoints by incorporating continuous data and clinical endpoints, and avoiding the ‘time-to-event’ analyses that are usually used, may be more useful in Phase II studies to provide a better indicator of the success or failure in the Phase III study. A framework that can accomplish this was proposed by O’Brien in 1984.58 In this method, one ranks the endpoints, including both traditional hard endpoints (e.g. mortality) and surrogate endpoints (e.g. biomarkers), as well as subjective endpoints (e.g. dyspnoea). An example of such a global rank list may rank all patients accordingly, with worst outcomes having the lowest score, such that the patient with least time to death would have the lowest score and the patient who avoided death, hospitalisation and had the best improvement in dyspnoea with little myocardial injury (lowest troponin) and no worsening of renal function and best reduction in NT-proBNP would have the highest score.57
This type of global ranking was used in the FIGHT study.59 The primary outcome measure was the global ranking of predefined events from randomisation to 180 days, including time to death, time to hospitalisation and time-averaged proportional change in NT-proBNP. Patients with the shortest time to an endpoint or the least proportional change were assigned the lower scores. Secondary outcomes, which were meant to be exploratory, included echocardiographic indices, functional assessment and the quality of life score, determined using the KCCQ. This type of endpoint assessment is more global, is thought to be more useful in Phase II studies and may provide insights into planning for a Phase III study.
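The hierarchical ordering described above can be sketched as a sort over three tiers: death, then hospitalisation, then biomarker change. This is a simplified illustration and not the FIGHT analysis code. The record fields are hypothetical, the direction of the biomarker tier (a larger rise in NT-proBNP ranks lower) is an assumption, and tied values are not given mid-ranks as a formal rank-based test would require.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Outcome(NamedTuple):
    """Hypothetical per-patient summary at the end of follow-up."""
    patient_id: str
    death_day: Optional[int] = None   # day of death, if the patient died
    hosp_day: Optional[int] = None    # day of first hospitalisation, if any
    ntprobnp_change: float = 0.0      # proportional change; negative = reduction


def rank_key(o: Outcome) -> Tuple[int, float]:
    # Tier 0: patients who died, earliest death ranked worst.
    if o.death_day is not None:
        return (0, o.death_day)
    # Tier 1: survivors who were hospitalised, earliest admission ranked worst.
    if o.hosp_day is not None:
        return (1, o.hosp_day)
    # Tier 2: event-free patients, ordered so the largest rise ranks worst.
    return (2, -o.ntprobnp_change)


def global_ranks(outcomes: List[Outcome]) -> Dict[str, int]:
    # Rank 1 is the worst outcome; the highest rank is the best outcome.
    ordered = sorted(outcomes, key=rank_key)
    return {o.patient_id: rank for rank, o in enumerate(ordered, start=1)}


cohort = [
    Outcome("E", ntprobnp_change=-0.40),
    Outcome("B", death_day=90),
    Outcome("D", ntprobnp_change=0.10),
    Outcome("A", death_day=20),
    Outcome("C", hosp_day=45),
]
print(global_ranks(cohort))  # {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}
```

The resulting ranks would then be compared between treatment arms with a rank-based test; that step is omitted here.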
Composite endpoints (e.g. the frequently used ‘death or HF hospitalisation’ endpoint) typically treat all components of the composite equally, despite the fact that clinicians and patients may value specific components of the composite very differently (e.g. death versus hospitalisation). Because non-fatal events tend to occur more frequently than deaths, less severe outcomes (e.g. hospitalisations) tend to drive composite endpoints to a greater degree than less common but more serious outcomes (e.g. death).
Although composite endpoints may provide higher event rates, it may be difficult to interpret whether drug or device effects are similar for all components or whether the effect of treatment is primarily on a more common, less serious component of the composite. From a clinical perspective, composite endpoints reflect the fact that the totality of patient experience with a given therapy may not be captured by mortality alone.
Most long-term studies use ‘time-to-event’ methods, in which patients are followed up until the first ‘event’ of the composite endpoint. This potentially introduces major problems in interpretation, in that less severe events happening earlier in the study (e.g. a brief HF hospitalisation) are counted, whereas more severe events (e.g. death) that happen after an initial event would be censored in the primary analysis of the trial outcomes.56 As an example of this potential discrepancy, using a standard chronic HF composite endpoint of time to death or first HF hospitalisation, a patient who is hospitalised for HF 2 weeks after randomisation but then survives and feels well for 5 years would be viewed as having a worse outcome than one who dies 2 months after randomisation. In this sense, the composite endpoint weighs the clinical course in a way that is incongruous with the way it would be viewed by patients and providers.
To overcome this problem, the win ratio and the Finkelstein–Schoenfeld method for reporting composite endpoints have been introduced, in which the different components of the composite endpoint are assigned different levels of importance.56 With the Finkelstein–Schoenfeld or the win ratio method, pairwise comparisons are performed and the scores are calculated based on the comparison of the importance of the outcome.60 Consider a primary composite endpoint, such as CV death and HF hospitalisation in an HF trial. Matched pairs of patients are formed from the new treatment and control groups based on risk profiles, and within each pair the patient on the new treatment is labelled a ‘winner’ or ‘loser’ depending on who has a CV death first. Only if that cannot be determined are they labelled a ‘winner’ or ‘loser’ depending on who had an HF hospitalisation first. Otherwise, the pair is considered tied. The win ratio is the total number of winners divided by the total number of losers; 95% CIs and p-values for the win ratio are then obtained.
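A minimal sketch of the matched-pairs comparison is shown below. It assumes that pairs have already been formed, that each pair is compared over its shared follow-up time and that events on the same day count as undecided; these details, and all names in the code, are illustrative assumptions rather than a published specification. Confidence intervals and p-values are omitted.

```python
from typing import List, NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    """Hypothetical per-patient record; days are counted from randomisation."""
    follow_up: int
    cv_death_day: Optional[int] = None
    hf_hosp_day: Optional[int] = None


def _earlier_event(a: Optional[int], b: Optional[int], horizon: int) -> Optional[str]:
    """Return 'a' or 'b' for whoever had the event first, or None if undecided."""
    # Only events within the pair's shared follow-up are counted.
    a_day = a if a is not None and a <= horizon else None
    b_day = b if b is not None and b <= horizon else None
    if a_day is None and b_day is None:
        return None
    if b_day is None or (a_day is not None and a_day < b_day):
        return "a"
    if a_day is None or b_day < a_day:
        return "b"
    return None  # same day: undecided at this level of the hierarchy


def compare_pair(treated: Patient, control: Patient) -> str:
    horizon = min(treated.follow_up, control.follow_up)
    # CV death is compared first; HF hospitalisation only if death is undecided.
    for attr in ("cv_death_day", "hf_hosp_day"):
        first = _earlier_event(getattr(treated, attr), getattr(control, attr), horizon)
        if first == "a":
            return "loss"  # the treated patient had the event first
        if first == "b":
            return "win"
    return "tie"


def win_ratio(pairs: List[Tuple[Patient, Patient]]) -> float:
    results = [compare_pair(t, c) for t, c in pairs]
    wins, losses = results.count("win"), results.count("loss")
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses


pairs = [
    (Patient(365), Patient(365, cv_death_day=100)),                   # win on CV death
    (Patient(365, hf_hosp_day=30), Patient(365, cv_death_day=200)),   # win: death outranks hospitalisation
    (Patient(365, cv_death_day=50), Patient(365)),                    # loss on CV death
    (Patient(365), Patient(365, hf_hosp_day=90)),                     # win on HF hospitalisation
    (Patient(365), Patient(365)),                                     # tie
]
print(win_ratio(pairs))  # 3.0
```

The second pair shows the hierarchy at work: the treated patient was hospitalised early but survived, whereas the control patient died, so the treated patient wins. A time-to-first-event analysis would have counted the early hospitalisation against the treated patient.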
The composite endpoint may actually serve to ‘dilute’ the observed treatment effect and thereby diminish statistical power to detect a difference between treatments if some of the components are affected in a different direction or unaffected altogether, despite an increase in the overall event rate.56
Furthermore, components of the composite may move in different directions. It is critical that safety signals, such as an increase in mortality, are not diluted or masked by an improvement in morbidity or hospitalisations within a combined endpoint. When composite endpoints are used, data collection for all components should continue until the end of the trial so that each component can be evaluated separately.61
Clinical Status Endpoints
Although mortality is a traditional endpoint for drugs or devices to be approved, a patient-centric approach would argue that the quantity of life lived may not be as meaningful if the patient experiences a poor clinical status, reduced functional capacity and poor quality of life.62 A patient may instead prefer a neutral effect or even a small negative effect on mortality while enjoying an improved quality of life and functionality.63,64 To address this, clinical status composite endpoints have been used in some HF trials. However, clinical status assessments are challenging.
Because of inherent intra- and interobserver variation, the reporting of subjective symptoms by specific type (e.g. shortness of breath or fatigue) and type of provocation (at rest versus exertion and amount of exertion) has been shown to be problematic and not useful in discerning treatment effect. Similarly, NYHA class carries considerable subjectivity and can be affected by non-HF conditions, such as chronic obstructive pulmonary disease and arthritis. In addition, one physician’s method of categorising patients into the various classes may differ from another’s because even the definitions of NYHA Classes I–IV are rather subjective. Then, there is global assessment, which is generally performed by the patient without physician input to avoid bias.
In addition, objective assessments of functional capacity through exercise testing have been used to evaluate the ability of the treatment to prolong exercise. However, issues with patient motivation, subjective encouragement introduced by the person administering the test, intra- or interobserver variability and improved performance with repeated testing are problematic. In addition, a patient’s performance may vary and the standalone exercise assessment may not reflect the general condition or exercise capacity of the patient.65 Finally, there are quality of life assessments that use questionnaires to capture a range of physical activity, as well as emotional, functional and cognitive impairments. HF-specific questionnaires that are commonly used include the KCCQ and the Minnesota Living with Heart Failure Questionnaire. A combination score incorporating different components of functional status, such as NYHA functional class and global assessment, may be useful.65
Combined Clinical Composite Score
The combined clinical composite score approach combines changes in clinical hard endpoints, such as mortality and hospitalisations, with NYHA functional class and a global assessment. In addition to clinical events, these assessments may also include symptom resolution and biomarker changes.66 The combined clinical composite scores are used to allow for smaller sample size and provide a comprehensive assessment of the trial result. As mentioned previously, when combined endpoints are analysed statistically and the time to the first event is used, subsequent episodes of clinical deterioration may be ignored during statistical analysis. The commonly used clinical composite score combines changes in NYHA functional class and global assessment together with the occurrence of major clinical events. For regulatory purposes, the endpoint used in major clinical trials must be clinically meaningful and must represent a direct assessment of present or future clinical status. Thus, symptoms and functional capacity are commonly used for clinical status, whereas death or hospitalisation are used for major clinical events. In general, clinical investigators have used clinical status for short- and intermediate-term trials and hard events like death and hospitalisation for long-term trials. However, even in short or intermediate trials, mortality and morbidity data are still included to demonstrate safety. A clinical composite score minimises the exclusion of randomised patients who deteriorated and withdrew due to worsening symptoms.65
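As a simplified illustration of how major clinical events, NYHA class and global assessment can be combined, the sketch below sorts each patient into one of three categories. The three-category scheme, the seven-point global assessment scale and the thresholds are assumptions chosen for the example; individual trials prespecify their own rules.

```python
def clinical_composite(died: bool, hospitalised_hf: bool,
                       nyha_change: int, global_assessment: int) -> str:
    """Classify a patient as 'worse', 'improved' or 'unchanged'.

    nyha_change: follow-up NYHA class minus baseline class (negative = improved).
    global_assessment: patient-reported score on an assumed scale from
    -3 (markedly worse) to +3 (markedly improved).
    """
    # Any major clinical event or clear deterioration overrides improvement.
    if died or hospitalised_hf or nyha_change > 0 or global_assessment <= -2:
        return "worse"
    # Without deterioration, improvement in either measure counts as improved.
    if nyha_change < 0 or global_assessment >= 2:
        return "improved"
    return "unchanged"


print(clinical_composite(False, False, -1, 1))  # improved
print(clinical_composite(False, True, -1, 3))   # worse
print(clinical_composite(False, False, 0, 0))   # unchanged
```

Because patients who die or withdraw with worsening HF are classified as worse rather than dropped, every randomised patient receives a category under this scheme.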
Current Applications of Endpoints
In contemporary clinical trials, primary endpoints continue to incorporate mortality and HF hospitalisations, either combined as a composite primary endpoint or with mortality as the primary endpoint and HF hospitalisations and other MACE as secondary endpoints. For example, in the PARADIGM-HF trial, the primary outcome was a composite of death from CV causes or first hospitalisation for HF.12 Secondary outcomes were the time to death from any cause, the change from baseline to 8 months in the clinical summary score on the KCCQ, the time to new onset of AF and the time to the first occurrence of a decline in renal function.12
The ADHF RELAX study, examining the efficacy and safety of serelaxin in acutely decompensated HF patients, had two primary efficacy endpoints: death from CV causes at 180 days and worsening HF at 5 days.67 Of note, worsening HF was originally a secondary endpoint and was added to the primary endpoint mid-trial. Worsening HF was defined as worsening signs or symptoms of HF that led to an intensification of treatment for HF, such as initiation or an increased dose of intravenous therapy with a diuretic, nitrate or other medication for HF or the institution of mechanical support, such as mechanical ventilation, ultrafiltration, haemodialysis, an IABP or a ventricular-assist device. The endpoint of worsening HF also included death from any cause or rehospitalisation for HF among patients who had been discharged before Day 5. Secondary efficacy endpoints included death from any cause at 180 days, the index hospital LOS and death from CV causes or rehospitalisation for HF or renal failure at 180 days.67 The addition of broadly defined worsening HF diluted the primary endpoint, although this did not appear to alter the trial results because all endpoints were neutral with regard to drug effect.
Thus, both of these chronic and acute HF trials incorporated mortality (all-cause, CV or both) as well as HF events. The ADHF RELAX investigators did not include other endpoints, such as global assessment, quality of life scores or haemodynamic data, despite it being a study in acutely decompensated HF patients.67
In chronic HF, it is important to know whether the treatment prevents multiple events. Recurrent event analyses, which determine the treatment effect on repeated events such as HF hospitalisations, are relevant given that hospitalisations for HF are the major contributor to healthcare costs. When only the first event or the time to the first event is recorded, the patient’s overall burden of disease may not be accurately represented. More contemporary trials appear to accept the importance of this analysis. In the PARAGON-HF study, evaluating sacubitril/valsartan versus valsartan, the primary endpoint was a composite of CV death and total (first and recurrent) HF hospitalisations.12 There was also adequate power in this study to enable standard time-to-first-event analysis.61
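The difference between the two approaches can be made concrete with a small Python sketch. The patient records below are invented for illustration, and simple counting stands in for the formal statistical models (for example, time-to-event or recurrent-event regression) that a real trial would use.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[int, str]  # (study day, event type)

# Hypothetical follow-up data for three patients.
records: Dict[str, List[Event]] = {
    "A": [(30, "hf_hosp"), (90, "hf_hosp"), (200, "cv_death")],
    "B": [(120, "hf_hosp")],
    "C": [],  # event-free throughout follow-up
}


def time_to_first_event(events: List[Event]) -> Optional[int]:
    """Day of the earliest event, or None if the patient had no event."""
    return min(day for day, _ in events) if events else None


def count_first_events(data: Dict[str, List[Event]]) -> int:
    """Time-to-first-event view: each patient contributes at most one event."""
    return sum(1 for events in data.values() if events)


def count_total_events(data: Dict[str, List[Event]]) -> int:
    """Recurrent-event view: every event contributes."""
    return sum(len(events) for events in data.values())


print(count_first_events(records))  # 2
print(count_total_events(records))  # 4
```

In this example, the first-event view treats patients A and B identically in terms of event count, even though patient A was hospitalised twice and then died; the total-event view captures that additional burden.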
Another type of analysis is responder analysis, whereby endpoints such as symptoms, functional status, exercise capacity, quality of life measures and haemodynamics can be evaluated based on the clinical relevance of the change as determined by expert consensus.68 This may be helpful when designing patient-centric studies for mainly symptom relief with perhaps a neutral effect on hard endpoints.
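A responder analysis of this kind reduces to comparing the proportion of patients whose change exceeds a prespecified threshold. The sketch below assumes a questionnaire score in which higher values are better; the 5-point threshold and the score changes are illustrative assumptions, not values taken from any trial or consensus statement.

```python
from typing import List


def responder_rate(score_changes: List[float], threshold: float) -> float:
    """Proportion of patients whose improvement meets or exceeds the threshold."""
    if not score_changes:
        raise ValueError("at least one patient is required")
    responders = sum(1 for change in score_changes if change >= threshold)
    return responders / len(score_changes)


# Hypothetical changes from baseline in a quality of life score.
treatment_changes = [6, 2, 10, -3]
control_changes = [1, 5, -2, 0]

THRESHOLD = 5  # assumed clinically relevant change, for illustration only

print(responder_rate(treatment_changes, THRESHOLD))  # 0.5
print(responder_rate(control_changes, THRESHOLD))    # 0.25
```

The choice of threshold drives the result, which is why it needs to be agreed before the trial rather than selected after the data are available.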
Future Directions and Challenges
The selection of appropriate endpoints is critical in HF clinical trials to allow the development and approval of therapies with meaningful outcomes for patients and clinicians. Currently, clinical trials predominantly rely on efficacy endpoints reflecting total and/or cause-specific mortality and morbidity. These endpoints are considered to be scientifically reliable and robust because they can be measured objectively, using standardised definitions, with accuracy and reproducibility and with minimal bias or confounding. However, endpoints must be clinically relevant to both patients and healthcare providers.
Depending on their perception of their overall HF symptoms and severity of illness, patients, particularly those who are sicker, severely limited or hospitalised, may choose quality over quantity of life (i.e. a drug therapy that improves symptoms, function and quality of life without a significant effect on survival, or even with the potential to reduce survival).69 Despite widespread recognition that improving symptoms, functional capacity and quality of life are important therapeutic goals in HF management, few drugs are currently approved for symptom relief in acute or chronic HF.
Endpoints must be tailored to meet the needs of the population under study. Therefore, patient-reported outcomes alone or in combination with measures of functional status may be more relevant to patients, especially those with more advanced stages of the disease. The choice of an endpoint is further influenced by the characteristics of the target patient population, patient phenotype (e.g. HFrEF versus HFpEF), episodes of care (e.g. acute versus chronic HF), stages of HF and treatment objectives (e.g. reductions in morbidity and mortality, safety, symptom management, improvement in haemodynamics) to discriminate between effective and ineffective therapies. A balanced focus on developing therapies that help patients live longer and improve symptoms and quality of life is crucial. Clinical trials must attempt to address the goals of patients and clinicians while addressing the requirements of the regulatory agencies and sponsors.
The FDA has recently issued guidance stating that improvement in patient-centric outcomes that measure a patient’s perception of health status (symptoms, functional status, physical function, quality of life), even without demonstration of favourable effects on survival and hospitalisations, can be the basis for approving therapies in development for HF. This guidance is important because it will encourage the development of HF therapies that address the totality of endpoints and meet evolving patient needs. The regulatory approval of drugs for symptom-based indications will allow coverage by third-party payers and improve access to drugs among vulnerable and sicker patients. In addition, although proof of improved survival or morbidity will not be required for approval, there will be consideration of the safety and mortality of these therapies, and studies will still be powered to reasonably rule out an adverse effect on mortality.
Robust methods for capturing HF outcomes other than hospitalisation or death must be developed, with strategies to reduce variability and improve precision in adjudication. For example, dyspnoea is an important outcome in HF clinical trials. However, standardised instruments for the assessment of dyspnoea need to be developed, validated and adopted consistently across HF research. Furthermore, longitudinal change in dyspnoea provides more information than a single point measurement, but a simple instrument that is sensitive to changes in health status (e.g. dyspnoea) is needed before the change in the severity of dyspnoea over time can be used as an endpoint.
Similarly, the development of validated and standardised patient-reported outcome instruments, self-administered when possible, will make these instruments acceptable as a basis for drug approval. Continued improvement in the methodology of HF clinical trials will favourably influence the future direction of HF research and, ultimately, patient outcomes. Finally, studies should be powered to capture mortality in the clinical trial, even if it is not a primary or efficacy endpoint, to establish safety margins. Future collaborative efforts require all stakeholders, including physicians, sponsors, industry, regulatory bodies and insurance companies, to focus on strategies and clinical trial designs that address these unmet needs in HF therapy trials and ultimately improve patient outcomes (Figure 1).
In this review, we summarised the evolution of endpoints used for HF therapies. Currently, large pivotal HF trials rely on demonstrating improvements in hard endpoints, including HF hospitalisation and mortality. Because changes in healthcare delivery models have led to fewer hospital admissions, hospitalisation endpoints have been expanded to include urgent or emergency care or the need for IV diuretic therapy.
Most long-term drug studies use ‘time-to-event’ methods and composite endpoints. Conventionally, for composite endpoints, all individual components are weighted equally, which is not consistent with real-world practice, where patients and clinicians may value specific components of the composite very differently (e.g. death versus hospitalisation). Because non-fatal events tend to occur more frequently than deaths, less severe outcomes (e.g. hospitalisations) tend to drive composite endpoints to a greater degree than less common but more serious outcomes (e.g. death). Therefore, methods for weighting the relative importance of the individual components must be improved. It is critical for safety measures, such as an increase in mortality, not to be diluted or masked by improvement in morbidity or hospitalisations in the combined endpoint.
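A short numerical sketch shows how equal weighting can mask an excess of deaths. The event counts and the weights below are entirely hypothetical and chosen only to illustrate the arithmetic; they are not derived from any trial, and choosing defensible weights is itself the unresolved methodological problem.

```python
from typing import Dict

# Assumed weights for illustration: one death counts as four hospitalisations.
WEIGHTS: Dict[str, float] = {"death": 1.0, "hf_hospitalisation": 0.25}


def unweighted_total(counts: Dict[str, int]) -> int:
    """Conventional composite: every component event counts equally."""
    return sum(counts.values())


def weighted_total(counts: Dict[str, int], weights: Dict[str, float] = WEIGHTS) -> float:
    """Composite in which each component is scaled by its assumed importance."""
    return sum(weights[event] * n for event, n in counts.items())


# Hypothetical arms of equal size.
control = {"death": 20, "hf_hospitalisation": 60}
treatment = {"death": 30, "hf_hospitalisation": 40}

print(unweighted_total(control), unweighted_total(treatment))  # 80 70
print(weighted_total(control), weighted_total(treatment))      # 35.0 40.0
```

With equal weighting, the treatment arm appears better (70 versus 80 events) even though it has 10 more deaths; once deaths carry more weight, the comparison reverses.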
HF patients experience a high burden of symptoms and functional limitations; therefore, patient-reported outcomes, quality of life and functional capacity are critical parameters for patients and shared decision making. In line with this is a recent paradigm change in regulatory guidance from the FDA allowing the use of measures of functional status or quality of life for regulatory approval in HF trials. Future collaborative and timely efforts are required to provide evidence for CV therapies that are effective, safe and meaningful for patients at different stages of HF.