Approaches to Estimating Maximum Study Power for
Economic Endpoints in Fixed Sample Size Designs
Elinor CG Chumney PhD and Kit N. Simpson DrPH, Center for Health Economics and Policy Studies, Colleges of Health Professions
and Pharmacy, Medical University of South Carolina, Charleston, SC, USA
We conducted a workshop at the ISPOR 10th
Annual International Meeting to introduce participants
to the process of using standard biostatistical
software to perform a power analysis for the
endpoints of economic studies when the sample
size has been fixed by clinical or programmatic
constraints. Increasingly, economists are being
asked to perform power analyses to ensure that
they will be able to find meaningful and statistically
significant differences for economic
outcome measures as part of a clinical trial or
proposal.
Many clinical investigators would like to include
economic outcomes as secondary end-points in
their clinical trial proposals. These types of economic
studies are often constrained to a maximum
sample size defined by the clinical sample
size requirements, and/or by design decisions
related to the size of a demonstration project or
program. In these cases the researcher responsible
for the design of the economic parts of the study must perform a power analysis to document
the study’s power to detect meaningful differences
in the economic outcome measures
that will be used. This analysis is especially
important for the development of protocols for
economic studies “piggy backed” onto clinical
trials and for proposals for health care program
evaluation studies.
As in our workshop, we will describe the purposes
of power analyses, review conditions that affect power, and demonstrate the iterative
process that we have used in the design of economic
evaluations in clinical trials and in evaluations
of programs to improve outcomes for
patients with complex chronic conditions.
Economic vs. Clinical Power Analyses
The disparate goals of economic and clinical
power analyses are highlighted in Table 1.

Economic study power is consistently lower than that of the associated clinical
trial. This is primarily because cost variables tend to have a skewed distribution,
and so a higher variance than clinical outcomes [1]. There is also
enormous variation in the combination of resource inputs (e.g., hospital
days, nursing hours, and various drug therapies) that can go into producing
a unit increase or improvement in a particular health outcome.
Economic study power is also complicated by clinical study exclusion criteria,
various methods for transforming skewed cost data, heterogeneity in the
baseline economic risk factors, and sample selection bias in resource use
measures. It is well known that power increases as sample size increases
ceteris paribus, as seen in Table 2.

However, this is only one of many factors that interact to affect a study’s
power to detect differences in health care costs [2]. These factors are discussed below.
Conducting a Power Analysis
With both clinical and economic power analyses, there are five basic parameters
that must be specified: the level of difference expected, the pre-facto
criteria for a desirable improvement, the acceptable risk of type I and type II
errors, and the expected direction of the difference.
Researchers should address the following questions:
- How much difference do I expect? Using Glass’ effect size as a guide,
small differences approximate 20% of the standard deviation and large differences
are roughly equal to the standard deviation [3, 4]. Large differences are
easy to detect, whereas small effects are more difficult;
- What is the minimum important value? A 10% to 20% improvement is
usually considered to be the desirable level of improvement for detection.
However, this may be different for cost of illness variables as discussed
below;
- How much risk of Type I error is acceptable? A 5% level of statistical
significance (corresponding to a 5% risk of type I error, the probability of
finding a significant difference when there really isn’t one) is most often
specified;
- How much risk of Type II error is acceptable? A 10% to 20% risk of type
II error is generally considered acceptable. Recall that type II error is the
probability of not detecting a significant difference when there really is one;
it is equal to 1-power; and
- Is the direction of the difference specified by the hypotheses? The
researcher should consider whether a clinical improvement or cost reduction
is expected with one therapy. If so, a one-sided test is appropriate; if the
direction of the clinical or cost difference is not pre-specified, then a two-sided
test should be used.
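The five parameters above can be combined into a single power estimate once the sample size is fixed. The following is a minimal sketch rather than the procedure used in the workshop: it assumes two equal-sized groups, a common standard deviation, and a normal approximation to the test statistic. The function name `power_two_group` and the sample size of 100 per arm are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def power_two_group(diff, sd, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power to detect a difference in means between two
    equal-sized groups (normal approximation, common standard deviation)."""
    z = NormalDist()
    # Standard error of the difference between the two group means
    se = sd * (2.0 / n_per_group) ** 0.5
    # A two-sided test splits the type I error risk across both tails
    tail = alpha / 2 if two_sided else alpha
    z_crit = z.inv_cdf(1 - tail)
    # Probability of rejecting in the expected direction; the tiny chance of
    # rejecting in the opposite direction is ignored
    return z.cdf(abs(diff) / se - z_crit)


# Hypothetical inputs: effect sizes expressed in standard deviation units
# (small = 0.2 SD, large = 1.0 SD), sample size fixed at 100 per arm.
for label, effect in [("small", 0.2), ("large", 1.0)]:
    two = power_two_group(effect, 1.0, 100)
    one = power_two_group(effect, 1.0, 100, two_sided=False)
    print(f"{label:5s} effect: two-sided power={two:.2f}, one-sided power={one:.2f}")
```

Because the sample size is fixed, the remaining levers are the size of the expected difference, the acceptable risk of type I error, and whether the test is one-sided. Rerunning the function with different values of these inputs is one way to carry out the iterative exploration described in this article.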
As in clinical power analyses, economic study power falls as the variance
increases and rises with a more stable comparison group. It also
increases as the effect size increases, as illustrated in Table 3.
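These relationships can be made explicit with a standard approximation. The expression below is a sketch under assumptions that are not part of the original text: two groups of equal size $n$, a common standard deviation $\sigma$, a true difference in means $\delta$, and a two-sided test at significance level $\alpha$ whose statistic is approximately normal.

```latex
1 - \beta \;\approx\; \Phi\!\left( \frac{|\delta|}{\sigma \sqrt{2/n}} \;-\; z_{1-\alpha/2} \right)
```

Here $\Phi$ is the standard normal distribution function, $z_{1-\alpha/2}$ is its $(1-\alpha/2)$ quantile, and $\beta$ is the risk of type II error. With $n$ fixed, power rises as the standardized effect size $|\delta|/\sigma$ rises, and it falls as $\sigma$ grows.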

When piggybacking an economic analysis onto a clinical trial, there are a
number of additional questions that should be addressed:
- What is the average cost of illness? Although payers may be quite
happy with an average cost reduction of $10,000, finding a statistically significant
difference between average costs of illness as high as $250,000
and $240,000 may require a large sample size, given the often skewed distribution
of cost data. In this case, an increased risk of a type I error may be
acceptable and the significance level may be pre-specified as α = 0.10;
- How will the expected cost offset be distributed? It is quite common to
find cost decreases resulting from the prevention of high cost episodes such
as hospital admissions in a larger proportion of patients in the intervention
group. This will decrease average cost, but often not enough to make a statistically
significant difference. In such cases we need to ask:
- Is average cost the best measure? If we expect to observe a decrease
in the high cost patients with treatment, it may be better to measure either a
reduction in hospital admissions, a reduction in the length of stay, or a difference
in hospital costs instead of average cost; and
- What is the best time frame? In some costly chronic conditions such as
HIV/AIDS, we often find expenditure patterns that are very high at the beginning
of a trial due to undiagnosed comorbid conditions that emerge once
treatment starts to improve a patient’s condition. In such cases, it may not
be desirable to use the cost data from the total trial period to test the economic
hypotheses, since this measure would comprise the cost of spillover
undiagnosed events present at baseline as well as the cost of new events
that occur over the course of the trial period. The cost differences may be
better defined as differences in cost that occur after the first 4-8 weeks in the
study. In the past, we have found that this time specification made a 10-fold
improvement in our statistical power.
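The $250,000 versus $240,000 example can be turned into rough arithmetic. The sketch below assumes a normal approximation, equal group sizes, 80% power, and a common standard deviation of $50,000. The standard deviation is a purely hypothetical figure chosen for illustration, as is the helper name `n_per_group`. Real cost data are usually skewed, so these numbers are likely to understate the sample size actually needed.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two
    means (normal approximation, equal group sizes, common SD)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the type I error risk
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


diff = 250_000 - 240_000   # expected average cost reduction from the text
assumed_sd = 50_000        # hypothetical standard deviation of cost

for alpha in (0.05, 0.10):
    n = n_per_group(diff, assumed_sd, alpha=alpha)
    print(f"alpha={alpha:.2f}: about {n} patients per group")
```

Relaxing the significance level from 0.05 to 0.10 lowers the required sample size. That is the trade-off described above: a somewhat higher risk of type I error is accepted in exchange for a study that is feasible within the fixed sample.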
After these primary factors are identified, the researcher should consider the
following:
- Log transformation. In the event of skewed economic data, a log transformation
may be an appropriate consideration;
- Multivariate methods. These can improve
economic study power as baseline heterogeneity
in economic risk may be controlled by variables
such as practice site, country, and patient history
of hospital admission;
- Key resource indicators. These can focus
research efforts on true cost drivers;
- Pre-specifying outliers. Specifying the anticipated
cost range prior to the analysis can aid in
identifying cost outliers;
- Comorbidity costs. Researchers should consider
adding a variable to control for comorbid
conditions at baseline, either the Charlson Index
[5, 6] or a simple question asking patients
whether they have been hospitalized overnight in
the two months before the study period;
- Practice pattern variations. These can be
especially problematic for multinational trials,
where some site practices are more likely to treat
patients in the hospital. We can control for differences
in the use of high cost factors by classifying
all practice sites as high, medium, or low
users of a specific type of high cost measure.
This classification would be made based on their
propensity for use, given patient severity, and
should be computed prior to unblinding;
- Effect of exposure time. Ceteris paribus, economic
study power increases as the exposure
time increases. However, researchers must balance
this against the fact that increases in exposure
time are often accompanied by increasing
numbers of patients lost to follow-up due to dropouts
or deaths, which decrease study
power;
- Effect of low impact time. With many treatments,
there may be some period of time before
patients begin to respond, and this lead time
should be excluded from the analysis; and
- Effect of rebound time. Conversely and as
discussed previously, with a number of therapies,
a patient’s condition may initially worsen before
eventually responding to treatment. Again, this
lead time should be excluded from the analysis.
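The effect of a log transformation on power can be explored by simulation before any data are collected. The sketch below is illustrative only and does not reproduce an analysis from the workshop. It assumes lognormally distributed costs, a treatment that scales costs by a constant ratio, 100 patients per arm, and a large-sample z-test with unequal variances. Every parameter value and function name is hypothetical. Note also that a test on log costs compares mean log costs (geometric means), not arithmetic mean costs, so it answers a somewhat different question.

```python
import random
from math import log, sqrt
from statistics import NormalDist


def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def rejects(a, b, z_crit):
    """Large-sample two-sided z-test for a difference in means (unequal variances)."""
    ma, va = mean_var(a)
    mb, vb = mean_var(b)
    se = sqrt(va / len(a) + vb / len(b))
    return abs(ma - mb) / se > z_crit


def simulated_power(n=100, ratio=0.8, sigma=1.0, alpha=0.05, reps=1000, seed=1):
    """Estimate power on the raw and log cost scales by repeated simulation."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    raw_hits = log_hits = 0
    for _ in range(reps):
        # Hypothetical skewed costs; treatment multiplies costs by `ratio`
        control = [rng.lognormvariate(9.0, sigma) for _ in range(n)]
        treated = [rng.lognormvariate(9.0 + log(ratio), sigma) for _ in range(n)]
        raw_hits += rejects(control, treated, z_crit)
        log_hits += rejects([log(x) for x in control],
                            [log(x) for x in treated], z_crit)
    return raw_hits / reps, log_hits / reps


raw_power, log_power = simulated_power()
print(f"power on raw costs: {raw_power:.2f}")
print(f"power on log costs: {log_power:.2f}")
```

The same scaffold could be extended to examine other items in the list, for example by dropping an initial low impact period from the simulated costs, provided the assumptions are stated in the study protocol.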
Conclusion
In conclusion, the estimation of economic study
power is a process of exploration and not a single
calculation of a simple parameter. This process
forces the researcher to consider many aspects
of the study design and to perhaps revisit clinical
design decisions that have been made previously.
In some rare cases, the best outcome from an
economic power analysis for a clinical trial may
be the decision not to proceed because critical
design constraints for clinical outcomes prevent
the design of a solid economic component.
However, in most cases, experienced researchers
are able to apply the process of power analysis in
a creative and open-minded fashion to gauge the
most sensitive cost measures.
References
1. Briggs AH. Economic evaluation and clinical trials: size matters. BMJ 2000;321:1362-3.
2. Lipsey MW. Design Sensitivity: Statistical Power for
Experimental Research. Newbury Park, CA: Sage Publications, 1990.
3. Kraemer HC, Thiemann S. How Many Subjects? Statistical Power
Analysis in Research. Newbury Park, CA: Sage Publications, 1987.
4. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. New
York: Academic Press, 1985.
5. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of
classifying prognostic comorbidity in longitudinal studies: development
and validation. J Chron Dis 1987;40:373-83.
6. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity
index for use with ICD-9-CM administrative databases. J Clin
Epidemiol 1992;45:613-9.