Wickedness or folly? The ethics of NICE’s decisions
K Claxton (1), A J Culyer (2)
  1. Centre for Health Economics, University of York, Yorkshire, England
  2. Institute for Work & Health, Toronto, Canada
Correspondence to: A J Culyer, Institute for Work & Health, 481 University Avenue, Toronto, Ontario, Canada M5G 1T4; aculyer{at}iwh.on.ca

Abstract

A rebuttal is provided to each of the arguments adduced by John Harris, an Editor-in-Chief of the Journal of Medical Ethics, in two editorials in the journal in support of the view that the National Institute for Health and Clinical Excellence’s procedures and methods for making recommendations about healthcare procedures for use in the National Health Service in England and Wales are the product of “wickedness or folly or more likely both”, “ethically illiterate as well as socially divisive”, responsible for the “perversion of science as well as of morality” and are “contrary to basic morality and contrary to human rights”.

  • NHS, National Health Service
  • NICE, National Institute for Clinical Excellence
  • QALY, quality-adjusted life year

John Harris describes a recent provisional recommendation by the National Institute for Health and Clinical Excellence (NICE)1 as “wickedness or folly or more likely both”.2,i This is not the language we expect in an academic journal, nor indeed if the intent were to nurture a forum for reasoned debate. The sentiments he expresses, however, do reflect some common populist objections to NICE recommendations that are often reported in the media. Similar objections have been raised by specific interest groups of patients, healthcare professionals and manufacturers. For this reason, identifying the root of the moral turpitude, if that is indeed what it is, ascribed by Harris to NICE is of wider and more general interest. As two people who had some degree of responsibility for framing the guidance used by NICE in making recommendations about which technologies the National Health Service (NHS) ought to adopt, and having been party to several controversial NICE decisions (including the provisional judgement regarding drugs for Alzheimer’s disease), we also evidently bear some personal moral responsibility. We seek to pick apart the turpitudinous possibilities from the recent exchanges between Harris and NICE’s senior officers, and, in doing so, locate them more precisely or banish them.

Harris’ objections seem numerous, but they actually turn on only a few basic confusions and mutual contradictions. Although both of the Harris editorials couch these criticisms in terms of objections to a particular measure of benefit recommended as the “reference case” by NICE (the quality-adjusted life year (QALY)), they seem on further investigation to lie deeper. Indeed, we conjecture that the editorials are not at root a critique of QALYs as a measure of health benefit at all, but a denial of the allocation problem in healthcare and a further denial of the fundamental proposition underlying NICE’s evaluative philosophy that the ultimate cost of providing healthcare for any group of patients is the health benefits or the opportunities to benefit (whether measured by QALYs or any other means), which are necessarily thereby denied to other groups of patients.

The core of Harris’s objection seems to lie in several remarks, which follow as quotes. The first of these appears in the originating editorial:


 Why are these patients not cost-effective to treat? The only answer must be that they are not worth helping. If it was simply on the basis of cost alone, if we simply cannot afford these drugs, that would be one thing, but NICE has said the drugs are not cost-effective in QALY terms. That means that the amount of better life expectancy they provide to these patients is not worth having for society. It is certainly worth having for the patients and those who care for them …

AFFORDABLE OR COST EFFECTIVE

We are not clear that describing a drug as one “we simply cannot afford” is really different or more acceptable than saying it is “not cost effective”. A possible, but surely trivial, alternative would be to take “unaffordable” to mean that one course of treatment for one patient costs more than the entire NHS budget. Excluding this unlikely interpretation, it seems to us natural to suppose that “unaffordable” means no more than that the costs exceed the benefits—that is, we choose to buy something else instead of the thing in question. In the context of NICE and the NHS, this means that the estimated health benefits that could be gained from the technology are less than those estimated to be forgone by other patients, as other procedures are necessarily curtailed or not undertaken. It is this comparison of health gained and health forgone that is at the heart of the rationale of cost-effectiveness analysis and the debate between Harris and the senior officers of NICE. It has nothing specifically to do with QALYs. It would remain unavoidable even if the currency of advantage offered by health care were expressed in terms of providing opportunities to benefit, as suggested by Harris, or any other measure of the good done by health care.

The answer to Harris’s (rhetorical?) question is that the patients with Alzheimer’s disease are (probably) not cost effective to treat with these drugs, because other patients would (probably) get greater benefits from the use of the resources spent by the NHS to acquire the drugs to treat Alzheimer’s disease. (It has never been claimed that the drugs for Alzheimer’s disease offer any improvement in life expectancy; the benefits reside in improved quality of life.) To put it in a brutal fashion after Harris’s style, “the drugs are not worth it”, not “the patients are not worth it”. The worth of the patients is not in question. To describe a procedure as not sufficiently worthwhile is not synonymous with the statement that these patients are “not worth helping”; it is simply the inexorable consequence of the principle (which NICE espouses) of using resources in the most effective known ways to promote people’s health. Nor is this the same as saying “the amount of better life expectancy they provide to these patients is not worth having for society”; rather, it is saying that all patients are to be counted as members of the society that NICE seeks to serve (although this may not be the same society as that to which Harris refers—NICE’s guidance is quite clear on the composition of “society”; Harris’s view is not so). But serving the whole of society requires NICE to take account of the alternative uses to which resources, including drug budgets, can be put. To use a budget to extract the last ounce of benefit for one patient group no matter what benefits were thereby denied to other patients being served by the same budget can hardly be considered to serve the needs of the whole of society. So, Harris may retort, “increase the drugs budget”, to which the further retort is, of course, “at the expense of what other health benefit for which other patients?” The retort to this may be, “increase the NHS budget”. 
Again, “at the expense of what other benefit to what other group in society?” (Lest we are misinterpreted, we should remind the reader that NICE does not set the NHS budget, even incrementally, and, to our knowledge, no one has suggested that it would in any way be proper for it to do so.) These are not rhetorical questions and it is the very purpose of cost effectiveness and related methods of analysis to try to provide at least partially quantified answers to questions such as these to inform those who have the difficult job of making—and being held accountable for—such decisions.


 NICE cannot claim it cannot fund these treatments because that would mean depriving other more deserving or needy patients of treatment because NICE does not review all other options.

DO UNIDENTIFIED PATIENTS MATTER?

NICE does not and cannot evaluate all possible uses of healthcare resources at any one time and generally cannot know which NHS activities will be displaced or which groups of patients will have to forgo health benefits. Harris is certainly correct about this. But what may be inferred from this? Again, what he is arguing is not clear. The two obvious possibilities are as follows:

  • There will be no real costs because other activities will not be displaced and health benefits will not be forgone.

  • Because the people bearing the cost are unidentified and unknown, these health benefits or lost opportunities to benefit are less important or of no consequence compared with the groups of patients under consideration who may benefit from treatment.

The first of the above is absurd, implying as it does that anything can be had at no real cost—at the limit one could even cut the NHS budget and have no effect at all on health benefits. The second rests on a fairly well-documented instinctive and emotional reaction towards identifiable people (a bit like the so-called “rule” of rescue). Such a populist sentiment, however, would be a strange basis for an ethical approach to healthcare policy. The NHS is supposed to serve everyone; so we know that everyone is to “count” in a fundamental sense. The problem is that who is known or unknown to be affected in any particular instance is a matter of time and ignorance. We know that, with enough information, or simply with sufficient time, those currently unidentified and regarded as unknown could become known and ought then to be valued in the same way as the others who are currently identified and known.3 Not to count the welfare consequences for people unknown seems to be a form of arbitrary discrimination to which Harris ought to object deeply—and rightly so. Ethical social decision making should reflect a broader view than that of the immediate and identifiable beneficiaries, not a sentiment born out of a myopic, narrow private perspective based on ignorance. Despite this, Harris suggests that the provisional recommendation to reject Alzheimer’s drugs will “…have very bad consequences for thousands of patients and good consequences for none.”

Harris ascribes no social value to the health benefits or opportunities offered by other NHS activities, which will be undertaken if the Alzheimer’s drugs are not purchased, probably because they accrue to unidentified and unknown patients.

A bias exists in NICE’s procedures, although the bias is not considered by Harris and the foregoing suggests that he is blind to its existence. The bias is that it is only those interests that are linked to identifiable patient groups which may benefit from the procedures under consideration that are represented in NICE’s decision-making processes and that have rights of consultation. Similarly, only those commercial and professional interests that are directly affected are included and consulted. Any one of these loses in some way if the procedure is rejected or its use is restricted. Those patients who will bear the true cost of any decision remain unidentified. No commercial, patient or professional lobbies represent them. Thankfully, the Institute and its advisory committees have adopted the ethical position that all health benefits and opportunities matter whether they accrue to identified or unidentified people. The NICE Appraisal Committee thus stands in trust to the many who are not represented and must compensate for the tendency for the interests of the known and identified to be over-represented.

EVALUATE PROCEDURES, NOT PATIENTS


 NICE should not be in the business of evaluating patients rather than treatments; to do so is contrary to basic morality and contrary to human rights.

At various points throughout both editorials, Harris attaches moral significance to a distinction between evaluating treatments and evaluating patients. Such a distinction indeed exists and, as we have indicated earlier, the methods of cost-effectiveness analysis do not evaluate patients (nor assess their worth—whatever that might mean in moral discourse), but rather they evaluate treatments (we prefer the more embracing term “procedures” on the grounds that NICE evaluates more than treatments). This is also, of course, exactly what clinical epidemiology does. Now, it goes without saying that all healthcare procedures exist for the care of patients with particular indications and characteristics. It is trivial to point out that one drug may be safe and effective for one group of patients but either completely ineffective or positively dangerous for another (this is part of assessing its worth). This is why clinical trials to establish safety and efficacy are conducted on particular patients, and licences are granted for use on particular patients and not for others. Procedures can be evaluated only when they are used for particular patients. So, inevitably, we compare the worth of alternative procedures in terms of the consequences their use has for patients. Patient welfare is the ultimate purpose of it all. But this is not the same as evaluating the worth of patients. At most it can be said to be evaluating their capacity to benefit from the use of the procedures and—one of the ultimately difficult tasks—evaluating one group’s ability to benefit compared with that of another. But this too is not evaluating the worth (this time relative) of different patients.

Harris contradicts himself in simultaneously holding to the above-quoted proposition in his first editorial and then suggesting that health care ought to be allocated such that all have equal opportunities of benefiting. To do this would certainly require an assessment of which patients can benefit from what procedures and which patients cannot or would be positively harmed. This is also apparent in his suggestion that in vitro fertilisation should be made available to all who have a significant chance of pregnancy. This would require someone to specify what is meant by “significant” and then to identify the groups of patients that met this criterion and those that did not—precisely the type of evaluation of procedures applied to particular patients that Harris describes as contrary to basic morality and human rights.

THE GOOD OF HEALTHCARE

The discussion to this point is independent of the means chosen to specify the good of healthcare. Whether that good is regarded as a health gain (as measured by one of the various forms of QALY or in some other way), or as opportunities of benefiting or as some other unspecified measure, is largely independent of the characterisation of the fundamental allocation problem in healthcare. We now turn to the specific matter of the QALY. To make difficult allocation judgements, it is evident that many issues have to be taken into account. Two are of particular significance here. One is the question of health gain as a measure of the good and the suitability of QALYs as a pragmatic measure; the other is the question of how NICE ought to make interpersonal judgements.

On QALYs, Harris, at least initially, appears adamant: “NICE has adopted the ubiquitous, but justly infamous QALY, the Quality Adjusted Life Year.”

This is not quite right in two respects. Firstly, NICE does indeed recommend the use of QALYs in the “reference case” (the principal reasons for which are that it is generic and not procedure-specific, its strengths and weaknesses are better understood than those of the alternatives, and that it has been designed to be easily and therefore inexpensively applied in research), but it also recommends departures from the standard assumptions underlying QALYs when they are considered to be inappropriate in a particular case. Secondly, we do not perceive much empirical basis for the factual assertion that the QALY is “infamous” or for the value judgement that this alleged reputation for infamy is “just”.

A large literature is available on QALYs, including the specific form most commonly existing in Europe (the soi-disant EuroQol), none of which is referred to by Harris (nor is it much referred to in any of the inflated list of self-citations contained in his reply to Rawlins and Dillon).4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 This literature has, over several years, comprehensively picked apart the assumptions needed for a simple form of generic outcome measure that could be routinely used in research. It is, of course, a matter of judgement as to how far we ought to compromise on an “ideal” way of characterising the quality of life expectation, although we considered it to be an uncontroversial matter to seek to make some adjustment to mere quantity of life to reflect its differential quality, and to do this in a way that reflected the general (NHS) public’s own views of what matters. Hence, for all the compromise, NICE’s use of QALYs embodies representative value judgements of the UK population derived from empirical research sponsored by the Department of Health. The use of QALYs by the NICE Appraisals Committee is always accompanied by direct investigations with affected (known) patient groups to detect possible misleading biases. But the measure certainly remains imperfect and, although this is true, it is not particularly more true of QALYs than of any other empirical measure. One has simply to make a judgement on whether it is good enough for the purposes at hand. So far as we are aware, the NICE Appraisals Committee is extremely well versed in the structure of QALY calculation and in its strengths and weaknesses. We are also convinced that the Committee has found QALYs to be, in general, a useful concept around which to organise some of their thinking. 
One reason for this is that the QALY methods have been extremely valuable in locating the critical components (sometimes called dimensions) in terms of which a generic health measure may be characterised, how they can be scaled, combined with one another at any point in time and over successive periods, and how finely they may detect changes in health in response to alternative forms of treatment (so-called construct validity). It was this research that identified the critical value judgements that were inherent in any concept of health. The main difference between QALYs and kindred indices and other measures lay in the explicitness with which the method identified the necessity of making value judgements and enabled them to be debated.

There seems to us nothing remotely “infamous” about this. Indeed, the participation of distinguished moral philosophers like Norman Daniels and Dan Brock in the development of QALY methods20,21 gives us further reason for confidence that, for all that the QALY is not ideal, it is not a “wicked” instrument. Of course, Harris may merely be indicating that he would make different value judgements from most other people who have considered the matter, a difference to which he is entirely entitled and which he can choose to promote in the editorial pages of the distinguished journal he edits—although he is not entitled to elevate his preferences to the universal and he does have a duty to examine properly the implications of the measure of the good of health care that he would prefer.

NO ALTERNATIVE TREATMENT


 There are two ways in which QALYs can be used. They might be used to determine which of rival therapies to give to a particular patient or which procedure to use to treat a particular condition; in short which of two different treatments is the more cost effective, better for patients, better for society. However QALYs are also used to determine not which of rival treatments to give a particular patient or group of patients, but whether or not to offer any treatment at all to some patients, or whether to offer a particular treatment to some patients, even when no alternatives are preferred.

We find this statement confused and in contradiction with his earlier statement about the infamy of QALYs. Indeed, the first part of the statement seems to affirm that measures of health gain such as QALYs are an appropriate way to inform the decisions that the Institute and its advisory committees face when developing guidance about the use of a procedure for particular groups of patients. For example, suppose we have two alternatives A and B, of which B is more costly but more effective than A. To determine which of the rivals to recommend, we compare the benefits of B rather than A, calculate the expected QALYs gained and compare these with the additional cost of B compared with that of A. Whether the cost per QALY gained from choosing B rather than A is worthwhile depends on the health benefits which will be forgone as other activities that benefit other patients are displaced by the additional cost of B. To choose between alternative ways of treating a particular patient or group of patients necessarily requires a comparison of the health benefits with the health costs to others. If health gain as measured by QALYs is an appropriate measure of the good of healthcare for a particular patient group, then QALYs must also be appropriate for measuring the health forgone by others (who are simply particular patients in another context).
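
The comparison of A with B described above can be sketched numerically. What follows is a minimal illustration under stated assumptions, not NICE's own procedure: the function names and all figures are hypothetical, and `cost_per_qaly_displaced` (an assumed rate at which the displaced NHS activities would have produced QALYs) is a parameter introduced here for illustration and does not appear in the article.

```python
def incremental_cost_per_qaly(cost_a, qalys_a, cost_b, qalys_b):
    """Additional cost of B over A, divided by the additional QALYs B provides."""
    extra_qalys = qalys_b - qalys_a
    if extra_qalys <= 0:
        raise ValueError("B must be more effective than A for this ratio to be meaningful")
    return (cost_b - cost_a) / extra_qalys


def net_health_gain(cost_a, qalys_a, cost_b, qalys_b, cost_per_qaly_displaced):
    """QALYs gained by the treated patients minus QALYs forgone by other patients.

    cost_per_qaly_displaced is a hypothetical assumption: the spending needed to
    produce one QALY in the activities displaced by B's additional cost.
    """
    qalys_gained = qalys_b - qalys_a
    qalys_forgone = (cost_b - cost_a) / cost_per_qaly_displaced
    return qalys_gained - qalys_forgone


# Illustrative figures only (per patient); none are taken from a NICE appraisal.
ratio = incremental_cost_per_qaly(cost_a=2000, qalys_a=1.0, cost_b=12000, qalys_b=1.25)
net = net_health_gain(cost_a=2000, qalys_a=1.0, cost_b=12000, qalys_b=1.25,
                      cost_per_qaly_displaced=20000)
print(ratio, net)
```

With these made-up figures, B costs 40,000 per QALY gained. If displaced activities would have produced one QALY for every 20,000 spent, adopting B gives a net change of -0.25 QALYs across all patients, that is, more health forgone by others than gained by those treated.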

In the second part of the statement, Harris asserts that this procedure is inappropriate when one alternative under consideration is not to offer a disease-modifying treatment. For example, if we were to interpret the low-cost less-effective alternative A as “no treatment” or “best supportive care”, then Harris would not accept the comparison of the health gain offered by B with the opportunity costs as an appropriate way of deciding whether treatment B was worthwhile. It appears that what Harris objects to is not the use of health gain as a measure of the good of healthcare, or even to QALYs as a measure of that health gain, but to the consistent application of these principles when one of the alternatives under consideration is not to offer disease-modifying treatment, but (say) best supportive care. The ethical foundations for this are hard to discern. Our conjecture (we are given no rationale) is that it rests on an implicit preference for biological disease modification over health gain. It suggests that, “concern, respect and protection of the community” can be treated as though they are synonymous with sponsored technologies which claim biological effects on disease processes.

This fetishisation of technology is unfortunate, as there are many circumstances in which best supportive care (or, indeed, something altogether other than health care) would have a greater effect on health-related quality of life, and offer greater concern, respect and protection, and be less harmful than a pharmaceutical agent or other manufactured technology. For example, to take yet another controversial NICE decision, it is perfectly plausible that £7500 per patient per year spent on home helps or other supportive or palliative services would have a greater effect on the health-related quality of life of someone with multiple sclerosis than spending the same money on a technology such as β-interferon, which claims disease modification.

It seems to us extremely doubtful that there exists an ethical justification for this fetishisation of health technologies and that, even if there were, it is far from clear what the principles are that ought to be applied when comparing B with best supportive care. Can it mean that everyone must get some of every technology? Or have an equal chance of getting it? Presumably, the technology must offer some benefit or opportunity to benefit. But how much benefit would be required, how relevant ought that chance of benefit to be and, most importantly, how much health gain ought to be sacrificed in other unidentified patients to satisfy the fetish?

INTERPERSONAL COMPARISONS

A more charitable interpretation of what we have dubbed as “fetish” might be to infer that Harris is objecting to interpersonal comparisons when the nature of the claim on resources that differentiates one claimant from another is the absence of any alternative (apart, we presume, from palliative and other care directed essentially at symptomatic relief). This interpretation can be seen to be one of a class of types of interpersonal comparison that the QALY agenda has raised. This class includes comparisons between beneficiaries who are similar in all respects save that one is currently more ill than the other, or has been more ill for a longer time, or is more multiply deprived than the other, or has a shorter expected life span than the other—all independent of the character of the disease in question. The QALY methodology offers no solution to the way in which these comparisons ought to be made; its virtue in such matters is confined to drawing attention to an issue that needs resolution. The comparisons are not in any way specific to QALYs, but are those that arise in virtually all comparisons of future health, whether measured by QALYs or in some other way. The QALY methodology, however, was the first we are aware of that explicitly identified issues of this sort, which we count as an ethical benefit of the technique.

The best way of handling such matters, once they have been identified and evidence on them is gathered and assessed, seems to be by a deliberative process. This, of course, is what NICE has done. On some matters it has consulted its Citizens’ Council, which in another context recently recommended that “if someone starts from a position of having a very severe disease, then we would value their improved health more than someone who had something relatively minor wrong with them – even if they ‘improved’ to the same extent,” although the recommendation was not unanimous. Pragmatically, the NICE guidance on the presentation of cost-effectiveness results for its “reference case” is to count a QALY as of equal social value to whomsoever it accrues. The grounds for this are made clear in its guidance. It “reflects the absence of consensus regarding whether these or other characteristics of individuals should result in differential weights being attached to QALYs gained”. This strikes us as neither “vicious” nor “totally inappropriate”. There being no agreed solution to the matter that can be routinely embodied in an algorithm, the matter is referred for deliberation.

AGEISM AND QALYS

Harris may have yet another problem in mind. It may not be the interpersonal comparison that offends, but the efficiency objective—future discounted QALYs and the fact that at root this is a measure of prospective life-years. The inventors of QALYs disliked the earlier dependence on outcomes measured simply in terms of life expectancy or years of survival, or numbers surviving to an arbitrary age, and sought to recognise that some lives were not merely short or long, but also agonisingly painful and handicapping or healthy and flourishing. What Harris may be really objecting to is the idea of taking any account at all, in assessing the health gain, of the amount of future time spent in whatever state it is spent in. It is plainly possible to hold the view that five years of future life of a given quality is to be valued the same as a week of life lived at that quality or 50 years lived at that quality. The prevailing view seems to us, however, to have been that people prefer not only good quality life to poor quality life but also more life of a given quality to less (except possibly at very low levels of quality, where there is some evidence that people generally prefer death to such life). If there are particular (or, indeed, general) circumstances under which we would wish to compensate for the shortness of future duration of benefit, then that is better done, we believe, by assigning an explicit weight to the benefits accruing to such people rather than by attaching no moral significance to the different quantities of time. Nothing in QALY methods denies this, just as there is nothing in the QALY methods to insist that it be not done. Similarly, nothing in the QALY methods requires the decision-maker to be blind to the amount of QALYs someone already has, or has had over their lifetime up to the present. 
What Harris calls “ageism” has little specifically to do with age, but is rather to do with future life spans, which have a variance for a variety of reasons that have nothing to do with the condition to be treated or the procedure being appraised. We think it better to confront that explicitly, as the issue of the moral status to afford to claims of past experience of QALYs or current health, or age, or future expected life without the intervention in question, is better treated not by using algorithms of any kind but openly, as a matter for deliberation. This is, as we understand it, the procedure recommended for NICE, and we again see no warrant whatsoever for describing it as “wicked or totally inappropriate”.
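
The arithmetic behind this point can be sketched as follows. This is a hedged illustration only: the function names, the yearly quality weights and the `equity_weight` parameter are hypothetical, and the argument above is that any such weight should be settled by deliberation rather than by formula. The sketch shows only that the QALY calculation neither requires nor forbids an explicit weight.

```python
def discounted_qalys(quality_by_year, discount_rate=0.0):
    """Sum yearly quality weights (0 = dead, 1 = full health), discounting later years."""
    return sum(q / (1 + discount_rate) ** t for t, q in enumerate(quality_by_year))


def weighted_gain(with_treatment, without_treatment, equity_weight=1.0, discount_rate=0.0):
    """QALY gain from treatment, optionally scaled by an explicit (hypothetical) weight."""
    gain = (discounted_qalys(with_treatment, discount_rate)
            - discounted_qalys(without_treatment, discount_rate))
    return equity_weight * gain


# Same quality improvement (0.5 -> 0.75) over 2 years versus 10 years; figures are illustrative.
short_gain = weighted_gain([0.75] * 2, [0.5] * 2)
long_gain = weighted_gain([0.75] * 10, [0.5] * 10)

# An explicit weight, if deliberation called for one, is applied openly to the short-duration gain.
short_gain_weighted = weighted_gain([0.75] * 2, [0.5] * 2, equity_weight=2.0)
print(short_gain, long_gain, short_gain_weighted)
```

Unweighted, the longer duration of benefit counts for more (2.5 against 0.5 QALYs). The weight makes any compensation for shortness of future benefit visible and open to challenge, rather than hiding it by ignoring duration altogether.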

Clearly, the positions Harris takes necessarily discriminate between patients on the basis of different expected benefits or different probabilities of benefiting from healthcare, and these differences depend on the personal characteristics of the people in question. Harris therefore plainly has no principled objection to discrimination per se. Beyond any doubt, there will be some characteristics that society agrees should not be used to discriminate, such as race or class, and we have strongly urged that “being known” versus “being unknown” is also an unacceptable form of discrimination. What we detect in these editorials is a writer with a strong preference or one making a strong value judgement that age should join these characteristics—it should never be used to identify those who can benefit most from healthcare, and society ought to sacrifice possibly substantial health benefits to others to satisfy this preference or value judgement. We cannot account for the preference (if that is what it is), and the grounds for subscribing to the value judgement are not self-evident. It is a value judgement that is certainly not universally held, particularly when the cost to others’ health and opportunities is taken into account. It is certainly not commonly held among clinical practitioners, and the limited empirical work that is available also suggests that it is not commonly held in society. Empirical work to establish the values attached to various concepts of fairness has typically found a variance.22 To claim that any who do not share his value judgement “are ethically illiterate as well as socially divisive” (p 375), are responsible for the “perversion of science as well as of morality” and are “contrary to basic morality and contrary to human rights” is—to say the least—petulant. 
If Harris wishes to assert that the necessary sacrifices in health benefits would be worthwhile, he should do so and support it with some empirical and coherent theoretical support rather than elevate his own unexplored (albeit oft-repeated) personal values to the height of the universal and denigrate the views of and attribute base motives to those who beg to differ.

CONCLUSIONS

The Institute and its advisory committees have upheld an ethical position that all health benefits and opportunities matter, whether they accrue to identified or unidentified people. They have not succumbed to the heat of direct interests or populist sentiment. Nor have they succumbed to the fetishisation of technologies that claim to modify disease processes and have therefore been unwilling to sacrifice the health outcomes of other unidentified patients to satisfy it. Similarly, on the question of interpersonal comparisons, NICE has not adopted ad hoc value judgements on a piecemeal basis that are neither self-evident nor universally held, particularly when the cost to others’ health and opportunities is considered. Rather, it has engaged in transparent, consultative and deliberative processes for reaching its recommendations. In the absence of consensus on whether particular individual characteristics should result in differential weights being attached to their capacity to benefit from healthcare, the “reference case” embodies a view that all improvements in health are valued equally.

Footnotes

  • i Harris somewhat disingenuously claims that this charge does not apply to those people who make NICE’s recommendations, but to their “ways of thinking”, regarding which his offensive charge was really an invitation to offer alternatives. The editorials are littered with other personally abusive charges, including one of hypocrisy.

  • Competing interests: KC is a member of the National Institute for Clinical Excellence (NICE) Appraisals Committee and was a member of the working party which recommended NICE’s current methodology for the conduct of economic appraisals. AJC was a member of the NICE Board, which commissioned and accepted this work and, although no longer on the Board, he remains a member of NICE’s Research and Development Committee.
