The following is a summary of the presentations
from Invited Issue Panel I,
“Will the QALY Survive?” at the ISPOR
11th Annual International Meeting, May
20-24, 2006, Philadelphia, PA, USA.
What Does the QALY Measure?
Comments by Daniel Kahneman PhD, Eugene
Higgins Professor of Psychology, Professor of
Psychology and Public Affairs, Woodrow Wilson
School, Princeton University, Princeton, NJ, USA
I have no doubt that the QALY will survive. But as
a psychologist looking at the QALY as a judgment
task where people are asked to evaluate health
states, I wonder about it.
What does the QALY measure? As best we
know, the QALY measures not the utility of the
health state of patients but the fear of that health
state in the public. I’m concerned about the discrepancy
between the fear people have of the
health state and what I would like to call the facts
of the matter. For example (in a completely different
context), travelers were asked how much
they would be willing to pay for a $100,000
insurance policy in the event of death on an airline
trip to Europe and how much they would be
willing to pay for a $100,000 policy in the event
of death due to a terrorist incident. People are
willing to pay more for the second policy than for
the first. Clearly, that’s absurd. Death for any reason
includes death in a terrorist incident. But people
are more afraid, more emotionally aroused, by the mention
of terrorism than by the mere mention of death.
To the extent that the a priori fear of the health
state is not really commensurate or is not a very
good predictive measure of what will actually
happen once that state is experienced, then
we’re likely to make a mistake.
In 20 years of looking at people trying to predict
their feelings and the feelings of other people in
various states, including but certainly not
restricted to health states, we’ve encountered
something that we call the focusing illusion. For
example, we asked people what percentage of
the time people in various categories are in a bad
mood. In one instance, we asked about people
working where there was no health insurance.
Twenty-seven percent of those working in a
place with no health insurance are in a bad mood
versus 22% of those working in a place that had
health insurance. The predicted effect is 50%
versus 19%. People exaggerate the difference
between good mood and bad mood. This is an
essentially universal finding. I even have a sort of
Chinese cookie maxim for it: nothing in life is
quite as important as it seems to be while you’re
thinking about it.
The focusing illusion takes a particular shape
when we’re predicting a situation to which there
is adaptation. Some years ago, a student of mine
did a study of paraplegics. We used the same
measure, the percentage of time spent in a bad
mood, but we added the amount of time spent in
the state. The prediction made by people who
knew a paraplegic was 75% at one month. At
one year, it was 60%. The prediction is that there
will be substantial adaptation. When we think of
a state, we tend to think of the initial moment of
the state where the emotional response is most
powerful and adaptation hasn’t taken place.
It seems to me that if we go on using the
responses of the public, then it is absolutely
essential to train the respondents to know a great
deal more about the health state and about the
experiences of people in that health state. What
we will get [in prediction] if we don’t train them
is their fear.
I would have preferred a measure of QALY based
on direct measurements of the experience of
patients [1]. That is very problematic and I don’t
think it’s going to happen in the foreseeable
future. But I do think that people who use QALYs
should be very aware of the psychological
research on the QALY task and they should probably
try to do something about adjusting the procedures
to diminish the role of these biases in
the QALY judgment.
What Are Economists Measuring
When They Attempt to Measure
the QALY?
Comments by Alistair McGuire PhD, Chair in
Health Economics, London School of
Economics, London, UK
What are some of the theoretical issues regarding
how economists might use a QALY? What are
some of the problems which occur when you
think about the theory? The underlying theory
itself rests upon expected utility theory (EUT).
EUT doesn’t describe behavior under uncertainty
well. If we look at some of the instruments used
within the health care sector to elicit QALYs, we
find that we’re looking at health states defined
across different dimensions or different attributes
and therefore we have to impose even further
assumptions on these measures relating to
the relationship between preferences as defined
across the different attributes.
It’s difficult to reconcile QALYs with any underlying
theory. And if it’s not based on any theory,
then what is it that economists are really measuring
when they’re attempting to measure QALYs
in the real world? We know that EUT isn’t a very
good description of how people actually behave
when faced with choice, but if I had to stick my
neck out, I’d agree with John Broome, a philosopher-economist, who believes that there really
isn’t any defense of QALYs as a measure of preference
structure as that relates to health states.
In that sense, it’s not a valuation measure at all,
but a measure of some form of health benefit.
It’s a measure of some two-dimensional array of
the benefits which may be derived from any
intervention. That’s as good as any theory gets.
On a practical level, at least when instruments
such as the HUI, the EQ-5D, or the SF-36 as it
translates into the QALY are used, there are consistent
returns at the median across different
populations. In other words, we’re not sure what
it’s measuring, but it seems to be measuring
something.
The QALY Is a Useful Index
Comments by Dennis G. Fryback PhD,
Professor of Population Health Sciences,
University of Wisconsin, Madison, WI, USA
I see three reasons that the QALY will survive.
The first is that it serves a purpose. Societal
decisions concerning allocation of resources in
health need the QALY. It is a useful index that
talks about capacity for function. The second is
that it does a pretty good job. We need something
that can command a core of community
agreement. I think that the measurement systems
we have in place do that. We need an index
that can be aggregated across individuals within
the society. We have systematically constructed
indexes using community average weights that
have a really consistent core of agreement.
14 October 15, 2007 ISPOR CONNECTIONS
A QALY is a statistic. It is used to indicate a relative
size of average impact of an intervention in a
defined population. It is a flawed number and we
invest more meaning in it than it deserves, but
it’s a useful index. There are other measures in
the same boat. For example, the Dow Jones
Industrial Average, which we all look at every
day, is a flawed measure. It doesn’t measure the
economy. It is one index, but it gains meaning
because we have experience with it over time.
The same goes for the GDP. They derive their meaning from
both thoughtful and purposeful construction and
long-term consistent observation.
The third reason is that there is simply no reasonable
alternative. We need something to represent
morbidity and mortality in a reasonably
transparent fashion with substantial agreement
to a scalar index for decisions. We need to have
data that can be collected regularly and on a large
scale in reasonably efficient fashion, something
that depends on people to indicate health states
in a community evaluation. We need to have data in hand
largely in advance of decisions. We can’t
go out and mount large-scale data collection to
respond to current public policy needs.
We Need a Research Agenda
I think that we need a research agenda. I don’t
want to build yet another index of health. I would
not let the perfect drive out the good. What we
have in hand is pretty good. It’s not perfect, not
by a long shot, but it’s doing a good job for the
major domains of health that we have.
I think that we need to use our research powers
to find better public deliberative processes for
valuing health descriptive systems. We need to
get better group processes on public scales. We
need to collect data about, understand, and integrate
longitudinal observations on people’s
health experiences so that we do have these
data. We need widespread use of existing indexes
for data collection. From these indexes, we’ll
gain meaning, much like the Dow Jones Index.
The QALY: Common Themes
Michael Drummond PhD, Professor of Health
Economics, Centre for Health Economics,
University of York, York, UK (moderator): Does
the panel feel any common themes emerging in
this debate?
Daniel Kahneman PhD: There is less theory to
the QALY than is generally assumed. There are
difficulties collecting valid judgments on health
states by people who are not in those health
states. With respect to feasible alternatives, I
conceded right off the bat that I do not think that
a feasible alternative is going to come out of
nothing. But changing and improving the way
that QALY data are collected would depend first on
admitting that we are not doing a perfect job and
that we should improve it.
Dennis G. Fryback PhD: Time and time again we
see that people’s assessments of health states
correspond. There’s a common core: the correlation
between assessments, whether on an arbitrary set of health states
or on our own health states, will be 0.6 to 0.7.
Alistair McGuire PhD: I think that what does
defend the QALY is its consistency in terms of
empirical returns. And the main problem I have
with experience utility is how to aggregate that up
to a societal level, because if everybody’s preferences
are so idiosyncratic in terms of relating
their experience, then society still has to make
decisions and it is hard to aggregate these idiosyncratic
feelings back.
Daniel Kahneman PhD: All the biases that have
been studied by students of judgment and decision-making
are biases on which there is widespread
agreement. So if you use the existence of
agreement as evidence for validity, you would
not recognize that there are biases. We may want
to think of how the measure could be improved,
how it could be better, what an ideal measure
would look like.
Dennis G. Fryback PhD: I guess I’d like to see where
it’s going wrong in the large.
Michael Drummond PhD: Are there any other
things that the panel would suggest? If we are
going to stick with the QALY, what would we do
to improve it?
Daniel Kahneman PhD: We ought to make sure
that the people who are assessing QALYs have
as much information as possible about the experience
of patients in the health states that they’re
assessing.
Alistair McGuire PhD: I still believe that we have
to get back to decision utility to make societal
choices and hopefully these experience utility
aspects will feed into that. I think that is a
research agenda.
Daniel Kahneman PhD: We need a measure of
decision utility to make decisions. I think we
ought to have that measure informed by the
experience utility of patients.
References
1. For more information on experience utility, see Kahneman,
Daniel. Determinants of health economic decisions in actual practice:
the role of behavioral economics. Summary of the presentation
given by Professor Daniel Kahneman at the ISPOR 10th Annual International Meeting, First
Plenary Session, May 16, 2005, Washington DC, USA. Value
Health 2006;9:65-7.