What is Heart Failure?
Heart failure (HF) is a diagnosis made on clinical grounds, requiring at its simplest only a clinical history and physical examination findings, although, of course, certain investigations can help, especially imaging to assess left ventricular (LV) mechanical function. Unlike cancer, or even myocardial infarction (MI), there is no pathological or biochemical test that is either sufficient or necessary to diagnose HF. Natriuretic peptides (NPs; B-type natriuretic peptide [BNP] or N-terminal-proBNP) are the closest tests we have to fulfilling this role, and although multiple studies have used elevated NPs as a diagnostic threshold, as a guide to therapy or as an inclusion criterion for clinical trial entry, none has become established as an essential diagnostic test for HF.
Two (or Three) Types of Heart Failure
The most recent guidelines of the European Society of Cardiology1 have a simple algorithm to guide the diagnosis of HF (see Table 1). It is deceptively simple because it actually indicates two separate diagnoses to be made: HF with reduced ejection fraction (HF-REF) and HF with preserved ejection fraction (HF-PEF). Unhelpfully, the table does not define these terms, and only in the text is it revealed that HF-REF refers to those who otherwise fit a HF diagnosis and, in addition, have a LV ejection fraction (LVEF) of ≤35 %. Can one assume that this value is always known; that it is stable and reproducible; that it is equivalent between different imaging modalities and methods of calculation; or even that, by exclusion, cases with a LVEF >35 % must be cases of HF-PEF because they are not by definition HF-REF? Unfortunately none of these assumptions is valid. Often the EF is not known (even more frequently in general practice), and echo-based, magnetic resonance imaging (MRI) and nuclear estimates may differ in the same patient by as much as 10 percentage points, enough to turn a case of true HF-REF into a non-case. Interestingly, the HF guidelines do not actually give recommendations for the treatment of HF; mostly they give recommendations for the treatment of HF-REF and give a very much smaller list of statements about what works in HF-PEF. Brief mention is made of a third group, those with all the symptoms and signs of HF but whose LVEF is in the range of 35–50 %. They are sometimes referred to as the ‘grey zone’, ‘HF with mild systolic dysfunction’ or HF with intermediate EF (HF-IEF). All this is a problem of our making. If we had stuck with a clinical diagnosis we would have a condition of ‘HF’, which we then would have evaluated in clinical trials. Early on we recognised that this condition presented with multiple pathophysiological and aetiological subtypes, as indeed there are for many types of cancer, even of one organ.
We could have described these subtypes and tested their responses to therapy separately as subgroups in a larger trial of HF. Had we done this we would have tested drugs that might work in all HF and secondarily assessed relative efficacy in major subtypes, and we then would have found that different EF ranges predicted quantitatively, but probably not qualitatively, different responses. Had we done this we would not have two (or three) diagnoses but merely one diagnosis with a pathophysiological parameter (LVEF) that is later shown to be helpful in determining different relative responses to therapy. We would not have dichotomised HF and left many of our patients understudied and undertreated. It is against this background that we review the diagnosis, epidemiology and treatment of HF-PEF.
HF-PEF is what is left over when LVEFs below 50 % are excluded. It is not a positive diagnosis at all: it is one of exclusion.
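The consequence of defining subtypes purely by LVEF cut-offs can be illustrated with a minimal sketch. It assumes the thresholds quoted in the text (≤35 % for HF-REF, 35–50 % for the intermediate group, the remainder HF-PEF) and uses hypothetical echo and MRI readings that differ by 10 percentage points; the function name and the readings are illustrative assumptions, not part of any guideline.

```python
def classify_by_lvef(lvef_percent):
    """Label a patient who already meets a clinical HF diagnosis.

    Thresholds follow the text: <=35 % HF-REF, 35-50 % intermediate
    ('grey zone'), and whatever is left over is called HF-PEF.
    """
    if lvef_percent <= 35:
        return "HF-REF"
    if lvef_percent < 50:
        return "HF-IEF"
    return "HF-PEF"


# Hypothetical readings for the same patient on two imaging modalities,
# differing by the 10 percentage points mentioned in the text.
echo_lvef = 33
mri_lvef = echo_lvef + 10

echo_label = classify_by_lvef(echo_lvef)
mri_label = classify_by_lvef(mri_lvef)
print(f"Echo LVEF {echo_lvef} % -> {echo_label}")
print(f"MRI  LVEF {mri_lvef} % -> {mri_label}")
```

The same patient moves from the group with a long list of treatment recommendations to a group with almost none, purely because of measurement variability around a cut-off.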
As a result of the lack of an established test, the identification, and therefore the treatment, of HF depends ultimately on the willingness or ability of a physician or medical team to call a particular case HF. As historically most cases of HF that have been enrolled in clinical trials or have been assessed for advanced therapies have been of the type with an enlarged left ventricle and poor systolic function (HF-REF), this particular type is often considered ‘real’ HF. This is of no concern where there is a fair degree of consensus about whether an individual case is or is not HF. Take the case of a younger man with a large MI who survives this initial insult and later presents with global poor LV function and fluid retention. This patient is easily recognised as fitting one of the clinical patterns of what we have for decades called the clinical spectrum of HF. Fortunately, this patient matches the inclusion criteria of any number of landmark clinical trials conducted over the period from the late 1980s to the late 2000s when most of our modern accepted HF therapies were first tested.
Compare this situation to a second patient, who is older, female and has a small thick-walled left ventricle, who presents repeatedly to hospital with pulmonary and peripheral oedema, who is limited markedly by exertional dyspnoea and who on echocardiography has a small chambered heart with a stiff, poorly compliant ventricle with incoordinate contraction. This patient in all likelihood has an EF of above 40 % or even 50 % and would not have matched the inclusion criteria of many of the landmark HF trials. She may also have a heightened amount of myocardial fibrosis, her diastolic function may be impaired and she may be at risk of atrial arrhythmias and subendocardial myocardial ischaemia due to vasomotor disturbance and endothelial dysfunction in her coronary vasculature. She has HF, her outlook is poor and she consumes a lot of healthcare resources with her recurrent emergency admissions. Yet she would not have been recruited into the landmark HF mortality and morbidity (M+M) randomised controlled trials (RCTs): CONSENSUS, SOLVD, Copernicus, Rales, Merit-HF, CIBIS-II, Ephesus, etc. As a result, we still do not know if she will respond to the treatments we offer our first patient and she is largely left untreated. This means we are failing approximately half of the patients with HF in the community, those who do not have HF-REF and who have been the subject of remarkably few major M+M RCTs.
The Epidemiology of Heart Failure with Preserved Ejection Fraction
Figure 1 shows patients admitted to European hospitals with a diagnosis of HF. As we can see, high EFs are just as likely as low ones, especially in females, in whom they form the majority.
Multiple epidemiological studies have suggested a prevalence of HF in western developed countries of 1–2 % of the adult population,2 with a steeply increasing prevalence of HF with increasing age. More than 50 % of patients who ever develop HF will do so for the first time over the age of 75 years. Age is also of major importance in predicting the type of HF a patient is likely to present with. HF-REF predominates in younger patients and is most commonly secondary to coronary artery disease. The major RCTs of HF therapy have mainly recruited younger patients, with a mean age of 61 years in all the beta-blocker trials prior to SENIORS, which specifically targeted an older population. This is a decade and a half younger than the average age in the community. The older patient, by contrast, is more likely to have hypertension as the predominant aetiological factor, to be female and to have the HF-PEF pattern of LV physiology. There has not been a single mortality and morbidity RCT of HF with an average age of recruits older than 76 years. The mortality of HF-PEF is said by many reports from hospital case series and clinical trials to be lower than that of HF-REF, suggesting it is a condition of lesser importance. In fact, in large epidemiological studies in a community setting, or in studies rigorously performed on a sound epidemiological basis, the prognosis of HF-PEF is virtually indistinguishable from that of HF-REF. The most worrying feature is that over the last 15 years only for HF-REF has there been any improvement in the risk of mortality; for HF-PEF it has remained unchanged. This period coincided with one of the most significant advances in the therapy of cardiovascular disease (CVD), the revolution in our treatment of chronic HF (CHF).
Data on consecutive patients hospitalised with decompensated HF at Mayo Clinic Hospitals in Olmsted County, Minnesota, US, from 1987 through 2001 show that over this period the proportion with HF-PEF has gone from just below 50 % to more than 50 % and that, in contrast to HF-REF, there has been no increase in long-term survival.3 See also Figure 2.
More recent reports similarly show outcomes as poor for HF-PEF as for HF-REF.4 The reports that have been said to show a much better prognosis for HF-PEF than for HF-REF patients are more commonly series of patients specifically investigated and chosen to enter clinical trials, where other co-morbidities (common in the elderly) are often exclusion criteria. Some reports of this nature suggest that survival is significantly better for HF-PEF compared with HF-REF,5 for example analyses comparing different clinical trials, such as CHARM-Preserved versus the two other CHARM studies, and such analyses also suggest that prognostic6,7 and pathophysiological factors may be distinct;8–12 however, these comparisons are biased by the fact that recruitment to trials is itself biased against patients with HF-PEF. That is because trials exclude many patients on the basis of confounding co-morbidities, by the reasoning that co-morbidities confound the trial’s evaluation of a treatment on one condition. But if co-morbidities are, by their natural history, common in a certain disease state, then by excluding patients with these co-morbidities one selects a very biased and unrepresentative group of patients. This cannot be corrected by analysing ever-larger numbers. We can only compare the outlook and prognosis of HF-PEF and HF-REF by recruiting patients from epidemiologically valid or whole population cohorts, not by analysing selected clinical trial cohorts. The mortality rate of trial HF-PEF patients is lower than that of HF-REF trial patients because so many of the higher risk HF-PEF patients are excluded to find a ‘purer’ form of HF-PEF. Epidemiologically sound studies find that the prognoses of HF-PEF and HF-REF are virtually indistinguishable. Early trials such as the DIG13 trial of digoxin recruited HF patients of both HF-PEF and HF-REF subtypes (sometimes called the DIG-REF trial and the DIG HF-PEF trial) and the effects were similar for the two types.
Later trials, in the interest of increasing event rates, over-recruited HF-REF and many restricted entry to patients with a LVEF less than 45 %, 40 %, 35 % or even lower (25 % for Copernicus).14 This was done to increase mortality rates, but had the effect of leaving HF-PEF patients unstudied and hence many years later untreated.
Clinical Treatment Trials in Heart Failure with Preserved Ejection Fraction
There have been remarkably few M+M RCTs in HF-PEF. These trials are of two types. In one type, all HF is recruited into a M+M trial and subsets include HF-PEF and HF-REF type patients. The trial is powered to establish its primary efficacy analysis based on the whole trial population; we then investigate important subgroups to see whether the treatment effect is statistically significantly different (or even trending towards a difference) in these subgroups. Occasionally, a subgroup treatment effect may be statistically significant in its own right, but this is not the principal analysis. The best estimate of the treatment effect in a subgroup, if there is no significant treatment-by-subgroup interaction, is that of the whole trial result itself. By this measure, if the trial is positive, and if the HF-PEF patients show a similar result and no statistically significant interaction with treatment, then this is considered evidence that the treatment also works in that subgroup, provided of course there are reasonable numbers and not just a handful.
The second type is the standalone trial powered for and recruiting only HF-PEF type patients. In contrast to over 100 such trials in HF-REF there have only been four such trials in HF-PEF: CHARM-Preserved, PEP-CHF, I-Preserve and TOPCAT, which are reviewed below.
The DIG trials
The DIG trial was actually two trials, although the second one (DIG-PEF) has been largely forgotten. In what has been called the main trial (that restricted to HF-REF patients) 6,800 HF patients with LVEF of 45 % or less were randomly assigned to digoxin or placebo. The primary outcome of all-cause mortality was unchanged and, of the secondary outcomes, HF hospitalisation showed a marked reduction (26.8 % versus 34.7 %, risk ratio 0.72 [0.66 to 0.79]; p<0.001). The combined endpoint of death from any cause or hospitalisation for worsening HF was significantly lower in the digoxin group (risk ratio, 0.85; 95 % confidence interval [CI] 0.79 to 0.91; p<0.001). The HF-PEF study was smaller (988 patients with LVEF >45 %) and chose the combined endpoint of death or hospitalisation due to worsening HF as the primary outcome. The result of DIG-PEF, as the trial publication quaintly put it, was: “With regard to the combined outcome of death or hospitalisation due to worsening HF, the results in the ancillary trial (risk ratio, 0.82; 95 percent confidence interval, 0.63 to 1.07) were consistent with the findings of the main trial.” Thus, although manifestly underpowered, the DIG-PEF trial only just missed its primary endpoint statistically. Had the combined DIG trial used this combined endpoint it would have been easily positive for the clinically acceptable combined endpoint of death or HF hospitalisation and the results in HF-REF and HF-PEF would have been indistinguishable.
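The claim that the two DIG results are statistically indistinguishable can be checked approximately from the published summary figures alone. The sketch below is an illustration under stated assumptions, not the trial's own analysis: it assumes the confidence intervals are symmetric on the log scale, recovers a standard error from each interval, and applies a simple z-test for the difference between two independent log risk ratios. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95 % critical value of the standard normal


def se_from_ci(lower, upper):
    """Standard error of log(ratio), assuming a symmetric 95 % CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def effect_test(ratio, lower, upper):
    """z and p for the null hypothesis that the ratio equals 1."""
    z = math.log(ratio) / se_from_ci(lower, upper)
    return z, two_sided_p(z)


def interaction_test(est_a, est_b):
    """z and p for a difference between two independent ratio estimates."""
    (ratio_a, lo_a, hi_a), (ratio_b, lo_b, hi_b) = est_a, est_b
    se_diff = math.sqrt(se_from_ci(lo_a, hi_a) ** 2 + se_from_ci(lo_b, hi_b) ** 2)
    z = (math.log(ratio_a) - math.log(ratio_b)) / se_diff
    return z, two_sided_p(z)


# Combined endpoint (death or worsening-HF hospitalisation) as quoted in the text.
dig_main = (0.85, 0.79, 0.91)  # LVEF <= 45 %
dig_pef = (0.82, 0.63, 1.07)   # LVEF > 45 %

z_main, p_main = effect_test(*dig_main)
z_pef, p_pef = effect_test(*dig_pef)
z_int, p_int = interaction_test(dig_main, dig_pef)

print(f"Main trial: z = {z_main:.2f}, p = {p_main:.2g}")
print(f"DIG-PEF:    z = {z_pef:.2f}, p = {p_pef:.2g}")
print(f"Difference between the two: z = {z_int:.2f}, p = {p_int:.2g}")
```

Under these assumptions the ancillary trial's own p-value is not below 0.05, while the test for a difference between the two estimates is nowhere near significance, which is the pattern expected when a smaller trial estimates the same effect less precisely.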
The SENIORS15,16 trial recruited both types of HF and was powered with a single primary endpoint of death or CV hospitalisation. In 2,128 HF patients aged ≥70 years, SENIORS showed a 14 % reduction in the primary outcome of all-cause mortality or CV hospital admission (hazard ratio [HR] 0.86, 95 % CI 0.74–0.99; p=0.039). It was a positive trial and LVEF had no impact on the treatment effect, with the point estimate of benefit in those patients with a LVEF >35 % being slightly bigger than in those with LVEF ≤35 % (see Figure 3). For SENIORS the overall trial was positive and the subset with preserved LVEF did just as well; there was no statistically significant interaction between LVEF and treatment effect, yet guidelines fail to recommend nebivolol other than for lower EF. This restriction rests on an analysis of a subset rather than on the pre-specified question, the authors maintaining that because the HF-PEF subset was not independently significant nebivolol cannot be recommended for this cohort. This is despite the fact that the correct statistical analysis is to assume any subset behaves as the whole cohort unless there is a reason or a statistical suggestion that it does not. Nebivolol should therefore be recommended for elderly HF patients irrespective of LVEF. None of the guidelines follow this logic, and in doing so are themselves illogical.
The Trials that did Recruit Heart Failure with Preserved Ejection Fraction Patients
There have been four M+M trials that have specifically and solely recruited HF-PEF patients: CHARM-Preserved,17 PEP-CHF,18 I-Preserve19 and TOPCAT.20
The CHARM programme actually represents a type of hybrid of the two types of trial mentioned above. The CHARM programme of candesartan is made up of three component trials that were, in addition, combined prospectively with a single powered endpoint and recruited and analysed together. It thus could be thought of as a single trial (the programme) with a HF-PEF subset (CHARM-Preserved)21 or as three trials, one of which, CHARM-Preserved, is in HF-PEF. If analysed the first way the overall programme was negative, as the primary endpoint was not reached. Seven thousand five hundred and ninety-nine CHF patients were randomised to candesartan 32 mg or placebo and the primary endpoint of all-cause mortality was not statistically significantly reduced: 23 % versus 25 %, HR 0.91, 95 % CI 0.83–1.00; p=0.055. We should therefore not even look at the HF-PEF or HF-REF cohorts for efficacy in these subsets. The CHARM-Preserved trial alone was powered for the composite of CV death or HF hospitalisation. In 3,023 patients, candesartan did not significantly reduce the primary endpoint (unadjusted HR 0.89 [95 % CI 0.77–1.03]; p=0.118), but it came very close (covariate-adjusted HR 0.86 [0.74–1.00]; p=0.051).
PEP-CHF was a randomised, double-blind trial comparing placebo with perindopril 4 mg/day in patients aged >70 years with a diagnosis of HF and echocardiographic evidence of diastolic dysfunction, excluding those with substantial LV systolic dysfunction or valve disease. The primary endpoint was a composite of all-cause mortality and unplanned HF-related hospitalisation: 850 patients were randomised and followed up for an average of 2.1 years. The power of the study to show a difference in the primary endpoint was reported to be only 35 % (because of poor recruitment and lower than expected event rates), giving only a one-third chance of showing an effect even if a real effect were present. Overall, 107 patients assigned to placebo and 100 assigned to perindopril reached the primary endpoint (HR 0.919, 95 % CI 0.700–1.208; p=0.545). By 1 year, before the extent of loss of adherence to randomised drug groups had become so catastrophically high as mentioned earlier, reductions in the primary outcome (HR 0.692, 95 % CI 0.474–1.010; p=0.055) and in hospitalisation for HF (HR 0.628, 95 % CI 0.408–0.966; p=0.033) were observed, and functional class (p=0.030) and 6-minute corridor walk distance (p=0.011) had improved in those assigned to perindopril.
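To see why so few events leave a trial underpowered, a rough sketch using the Schoenfeld approximation for a 1:1 randomised time-to-event comparison is given here. The 207 events are the PEP-CHF primary events quoted above (107 + 100); the assumed true hazard ratios are purely hypothetical values chosen for illustration and are not the trial's design assumptions, so the output should not be read as a reconstruction of the reported 35 % figure.

```python
import math

Z_95 = 1.959964  # two-sided 5 % significance threshold


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def approx_power(events, hazard_ratio):
    """Approximate power of a log-rank test with equal allocation.

    Schoenfeld approximation: the test statistic is roughly normal with
    mean sqrt(events) / 2 * |log(HR)| and unit variance. The negligible
    probability of significance in the wrong direction is ignored.
    """
    signal = math.sqrt(events) / 2 * abs(math.log(hazard_ratio))
    return normal_cdf(signal - Z_95)


pep_chf_events = 107 + 100  # primary endpoint events reported in the text

# Hypothetical true effects, for illustration only.
for assumed_hr in (0.90, 0.80, 0.70):
    power = approx_power(pep_chf_events, assumed_hr)
    print(f"Assumed true HR {assumed_hr:.2f}: approximate power {power:.0%}")
```

With roughly 200 events, only a large true effect would give conventional 80 % power; a modest but clinically worthwhile effect would be missed more often than it is found.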
I-Preserve similarly was a randomised, double-blind, placebo-controlled trial in HF-PEF, but in this case was much larger. Four thousand one hundred and twenty-eight patients aged 60 years or older with LVEF >45 % were randomised to 300 mg of irbesartan or placebo and followed for an average of 49.5 months. The primary endpoint was death or CV hospitalisation. The primary outcome occurred in 742 patients in the irbesartan group and 763 in the placebo group, giving primary event rates of 100.4 and 105.4 per 1,000 patient-years, respectively (HR 0.95, 95 % CI 0.86 to 1.05; p=0.35). The mortality rates were similar. This result seems disappointing, but it is directionally and in scale not dissimilar to the result of Val-HeFT22 of valsartan in HF-REF, where in 5,010 patients 160 mg of valsartan reduced the primary mortality/morbidity endpoint by 13.2 % (relative risk 0.87, 97.5 % CI 0.77 to 0.97; p=0.009), with no difference in mortality. Also, in I-Preserve there was a high rate of discontinuation of study treatment (34 % by the end of the study) and a high rate of concomitant use of ACE inhibitors, spironolactone and beta-blockers.
The most recent trial, TOPCAT, built upon earlier smaller trials and investigated another HF-REF-proven treatment.23 TOPCAT randomised 3,445 patients aged 50 years or older with LVEF >45 % to spironolactone 30 to 45 mg/day or placebo. The trial was not quite positive: the primary composite endpoint was reduced from 20.4 % to 18.6 % (HR 0.89, 95 % CI 0.77–1.04; p=0.138) and HF hospitalisations were reduced from 14.2 % to 12.0 % (HR 0.83, 95 % CI 0.69–0.99; p=0.04). Yet again a negative trial, but in its pattern not dissimilar to Val-HeFT. Interestingly, in an analysis that was both pre-specified and based on a variable that was actually stratified for at randomisation (ensuring the likelihood of good balance between placebo and active), in those patients who qualified for TOPCAT on the basis of an elevated NP level (BNP ≥100 pg/ml or NT-proBNP ≥360 pg/ml) there was a highly significant 35 % reduction in the primary endpoint. In the elevated NP group there were 78 primary events in 490 patients (15.9 %) compared with 116 events in 491 placebo patients (23.6 %; HR 0.65, 95 % CI 0.49–0.87; p=0.003),24 entirely consistent with what has been seen in HF-REF with spironolactone or eplerenone.
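The event counts quoted for the elevated-NP stratum of TOPCAT allow the crude arithmetic to be reproduced directly. The sketch below is illustrative only: it works from simple proportions over the trial's follow-up, so the crude risk ratio is not the same quantity as the reported time-to-event hazard ratio of 0.65, and the number needed to treat is a rough figure that ignores follow-up time and censoring. The function name is hypothetical.

```python
def crude_summary(events_treated, n_treated, events_control, n_control):
    """Crude risks, risk ratio, absolute risk reduction and NNT from counts."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated  # absolute risk reduction
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "risk_ratio": risk_treated / risk_control,
        "arr": arr,
        "nnt": 1 / arr,  # over the trial follow-up, ignoring censoring
    }


# Elevated natriuretic peptide stratum, counts as quoted in the text.
topcat_np = crude_summary(events_treated=78, n_treated=490,
                          events_control=116, n_control=491)

print(f"Spironolactone: {topcat_np['risk_treated']:.1%}")
print(f"Placebo:        {topcat_np['risk_control']:.1%}")
print(f"Crude risk ratio: {topcat_np['risk_ratio']:.2f}")
print(f"Absolute risk reduction: {topcat_np['arr']:.1%}")
print(f"Approximate NNT over follow-up: {topcat_np['nnt']:.0f}")
```

The crude proportions match the 15.9 % and 23.6 % given in the text, and the crude risk ratio sits close to, though not exactly at, the published hazard ratio, as expected when comparing a proportion-based and a time-to-event measure.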
The Long-term Effect of Trials that Excluded Heart Failure with Preserved Ejection Fraction
We have seen that the major trials have largely been restricted to HF-REF patients. HF-PEF trials should be able to duplicate these results. This has not happened, partly because of restricted funding. Some trials (e.g. I-Preserve) have recruited very slowly and have been funded publicly rather than by a corporate sponsor, where funding is usually more generous. Consider the case of the beta-blocker carvedilol. Carvedilol is now off-patent in most developed countries, so further company sponsorship of large expensive trials is unlikely. The sponsors did, however, pay for three trials: the US Carvedilol program,25 Copernicus and COMET.26 None of these trials included HF-PEF patients. Where resources for trials are limited it seems a tragedy that the third major trial for this agent, instead of recruiting the half of all HF that had been totally ignored, targeted a question of only marginal scientific value: whether carvedilol was superior to a non-proven formulation of another beta-blocker, non-slow-release metoprolol. We cannot, sadly, depend on sponsors to study the patient populations in need; they focus where their drug will look best and avoid the more difficult or uncertain areas. If we had recruited patients with HF irrespective of LVEF in a slightly enlarged Copernicus trial and performed subanalyses of HF-PEF and HF-REF we would be in a much stronger position today. It is hard to avoid the conclusion that we should investigate27–29 and treat HF-PEF patients as rigorously as their HF-REF counterparts.
HF is a spectrum of disorders that lead to a single clinical picture. Unfortunately early in the development of effective medication we restricted our attention to only one end of the spectrum, HF-REF, leaving the other conditions lumped together as HF-PEF to go virtually unstudied and untreated for nearly two decades. This lack of evidence for HF-PEF therapies is largely a problem of our own making and we now need to double our efforts to unravel the presentation, pathophysiology and treatment of a condition that remains a major burden and which continues to grow in importance as the population ages.