The Official News & Technical Journal Of The International Society For Pharmacoeconomics And Outcomes Research
ECONOMIC EVALUATION

Approaches to Estimating Maximum Study Power for Economic Endpoints in Fixed Sample Size Designs

Elinor CG Chumney PhD and Kit N. Simpson DrPH, Center for Health Economics and Policy Studies, Colleges of Health Professions and Pharmacy, Medical University of South Carolina, Charleston, SC, USA


We conducted a workshop at the ISPOR 10th Annual International Meeting to introduce participants to the process of using standard biostatistical software to perform a power analysis for the endpoints of economic studies when the sample size has been fixed by clinical or programmatic constraints. Increasingly, economists are being asked to perform power analyses to ensure that they will be able to find meaningful and statistically significant differences for economic outcome measures as part of a clinical trial or proposal.

Many clinical investigators would like to include economic outcomes as secondary endpoints in their clinical trial proposals. These types of economic studies are often constrained to a maximum sample size defined by the clinical sample size requirements, and/or by design decisions related to the size of a demonstration project or program. In these cases, the researcher responsible for the design of the economic parts of the study must perform a power analysis to document the study’s power to detect meaningful differences in the economic outcome measures that will be used. This analysis is especially important for the development of protocols for economic studies “piggybacked” onto clinical trials and for proposals for health care program evaluation studies.

As in our workshop, we will describe the purposes of power analyses, review conditions that affect power, and demonstrate the iterative process that we have used in the design of economic evaluations in clinical trials and in evaluations of programs to improve outcomes for patients with complex chronic conditions.

Economic vs. Clinical Power Analyses
The disparate goals of economic and clinical power analyses are highlighted in Table 1.

Economic study power is consistently lower than that of the associated clinical trial. This is primarily because cost variables tend to have a skewed distribution, and so a higher variance than clinical outcomes [1]. There is also enormous variation in the combination of resource inputs (e.g., hospital days, nursing hours, and various drug therapies) that can go into producing a unit increase or improvement in a particular health outcome.

Economic study power is also complicated by clinical study exclusion criteria, various methods for transforming skewed cost data, heterogeneity in the baseline economic risk factors, and sample selection bias in resource use measures. It is well known that power increases as sample size increases ceteris paribus, as seen in Table 2.

However, sample size is only one of many factors that interact to affect a study’s power to detect differences in health care costs [2]. These factors are discussed below.

Conducting a Power Analysis
With both clinical and economic power analyses, five basic parameters must be specified: the size of the expected difference, the a priori criterion for a desirable improvement, the acceptable risks of type I and type II errors, and the expected direction of the difference.

Researchers should address the following questions:

  1. How much difference do I expect? Using Glass’ effect size as a guide, small differences approximate 20% of the standard deviation and large differences are roughly equal to the standard deviation [3, 4]. Large differences are easy to detect, whereas small effects are more difficult;

  2. What is the minimum important value? A 10% to 20% improvement is usually considered to be the desirable level of improvement for detection. However, this may be different for cost of illness variables as discussed below;

  3. How much risk of Type I error is acceptable? A 5% level of statistical significance (corresponding to a 5% risk of type I error, the probability of finding a significant difference when there really isn’t one) is most often specified;

  4. How much risk of Type II error is acceptable? A 10% to 20% risk of type II error is generally considered acceptable. Recall that type II error is the probability of not detecting a significant difference when there really is one; it is equal to 1-power; and

  5. Is the direction of the difference specified by the hypotheses? The researcher should consider whether a clinical improvement or cost reduction is expected with one therapy. If so, a one-sided test is appropriate; if the direction of the clinical or cost difference is not pre-specified, then a two-sided test should be used.

As in clinical power analyses, economic study power is inversely proportional to the variance, increasing with a more stable comparison group. It also increases as the effect size increases, as illustrated in Table 3.
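
The relationships above can be illustrated with a minimal sketch. It assumes two equal-sized groups, a standardized effect size (difference divided by the standard deviation), and a normal approximation to the test statistic; the function name power_two_means and the sample size of 100 patients per arm are hypothetical choices for illustration, not values from the workshop.

```python
from statistics import NormalDist


def power_two_means(effect_size, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power to detect a standardized difference between two
    equal-sized groups, using a normal approximation (an assumption)."""
    z = NormalDist()
    # Critical value depends on the significance level and on sidedness
    z_crit = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    # Expected value of the test statistic under the alternative hypothesis
    shift = effect_size * (n_per_group / 2) ** 0.5
    power = 1 - z.cdf(z_crit - shift)
    if two_sided:
        # Small contribution from rejecting in the opposite tail
        power += z.cdf(-z_crit - shift)
    return power


# Hypothetical fixed sample size of 100 patients per arm
for d in (0.2, 0.5, 1.0):
    two = power_two_means(d, 100)
    one = power_two_means(d, 100, two_sided=False)
    print(f"effect size {d:.1f}: two-sided power {two:.2f}, one-sided power {one:.2f}")
```

Because the effect size is the raw difference divided by the standard deviation, halving the standard deviation doubles the effect size, which is how a more stable comparison group raises power when the sample size is fixed.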

When piggybacking an economic analysis onto a clinical trial, there are a number of additional questions that should be addressed:

  1. What is the average cost of illness? Although payers may be quite happy with an average cost reduction of $10,000, finding a statistically significant difference between average costs of illness as high as $250,000 and $240,000 may require a large sample size, given the often skewed distribution of cost data. In this case, an increased risk of a type I error may be acceptable and the significance level may be pre-specified as α = 0.10;

  2. How will the expected cost offset be distributed? It is quite common to find cost decreases resulting from the prevention of high cost episodes such as hospital admissions in a larger proportion of patients in the intervention group. This will decrease average cost, but often not enough to make a statistically significant difference. In such cases we need to ask:

  3. Is average cost the best measure? If we expect to observe a decrease in the high cost patients with treatment, it may be better to measure either a reduction in hospital admissions, a reduction in the length of stay, or a difference in hospital costs instead of average cost; and

  4. What is the best time frame? In some costly chronic conditions such as HIV/AIDS, we often find expenditure patterns that are very high at the beginning of a trial due to undiagnosed comorbid conditions that emerge once treatment starts to improve a patient’s condition. In such cases, it may not be desirable to use the cost data from the total trial period to test the economic hypotheses, since this measure would comprise the cost of spillover undiagnosed events present at baseline as well as the cost of new events that occur over the course of the trial period. The cost differences may be better defined as differences in cost that occur after the first 4-8 weeks in the study. In the past, we have found that this time specification made a 10-fold improvement in our statistical power.
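
To make the first question concrete, the following sketch estimates the sample size per group needed to detect the $10,000 difference between $250,000 and $240,000. The standard deviation of $100,000 is a hypothetical value chosen only for illustration, as is the function name n_per_group; the calculation uses the usual normal-approximation formula for comparing two means and ignores skewness.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80, two_sided=True):
    """Approximate sample size per group for comparing two means
    (normal approximation, equal group sizes, common standard deviation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


diff = 250_000 - 240_000   # expected cost difference from the text
sd = 100_000               # ASSUMED standard deviation of cost, illustration only

for alpha in (0.05, 0.10):
    n = n_per_group(diff, sd, alpha=alpha)
    print(f"alpha = {alpha:.2f}: about {n} patients per group for 80% power")
```

Under these assumptions, relaxing the significance level from 0.05 to 0.10 lowers the required sample size, which is the trade-off described in item 1. With a fixed sample size, the same relaxation raises power instead.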

After these primary factors are identified, the researcher should consider the following:

  • Log transformation. In the event of skewed economic data, a log transformation may be an appropriate consideration;

  • Multivariate methods. These can improve economic study power as baseline heterogeneity in economic risk may be controlled by variables such as practice site, country, and patient history of hospital admission;

  • Key resource indicators. These can focus research efforts on true cost drivers;

  • Pre-specifying outliers. Specifying the anticipated cost range prior to the analysis can aid in identifying cost outliers;

  • Comorbidity costs. Researchers should consider adding a variable to control for comorbid conditions at baseline, either the Charlson Index [5, 6] or a simple question asking patients whether they have been hospitalized overnight in the two months before the study period;

  • Practice pattern variations. These can be especially problematic for multinational trials, where some site practices are more likely to treat patients in the hospital. We can control for differences in the use of high cost factors by classifying all practice sites as high, medium, or low users of a specific type of high cost measure. This classification would be made based on their propensity for use, given patient severity, and should be computed prior to unblinding;

  • Effect of exposure time. Ceteris paribus, economic study power increases as the exposure time increases. However, researchers must balance this against the fact that longer exposure times are often accompanied by more patients lost to follow-up through dropouts or deaths, which decreases study power;

  • Effect of low impact time. With many treatments, there may be some period of time before patients begin to respond, and this lead time should be excluded from the analysis; and

  • Effect of rebound time. Conversely, and as discussed previously, with a number of therapies a patient’s condition may initially worsen before eventually responding to treatment. Again, this lead time should be excluded from the analysis.
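
The effect of a log transformation on power can be explored by simulation before the study begins. The sketch below is one possible approach, not the procedure used in the workshop. It assumes log-normally distributed costs, a hypothetical median control cost of $10,000, a 20% cost reduction with treatment, 100 patients per arm, and a large-sample normal critical value. Note that a test on log costs compares geometric rather than arithmetic means, so it answers a slightly different question.

```python
import math
import random
from statistics import NormalDist


def welch_z(x, y):
    """Test statistic for a difference in means with unequal variances."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)


def simulated_power(n_per_group, cost_ratio, sigma=1.0, alpha=0.05,
                    n_sims=1000, seed=1):
    """Estimate power on the raw and log cost scales by simulation.
    All distributional settings are assumptions for illustration."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # large-sample approximation
    mu_control = math.log(10_000)                 # ASSUMED median control cost
    mu_treated = mu_control + math.log(cost_ratio)
    hits_raw = hits_log = 0
    for _ in range(n_sims):
        control = [rng.lognormvariate(mu_control, sigma) for _ in range(n_per_group)]
        treated = [rng.lognormvariate(mu_treated, sigma) for _ in range(n_per_group)]
        # Test on the raw (skewed) cost scale
        if abs(welch_z(control, treated)) > z_crit:
            hits_raw += 1
        # Same test after a log transformation
        log_control = [math.log(c) for c in control]
        log_treated = [math.log(t) for t in treated]
        if abs(welch_z(log_control, log_treated)) > z_crit:
            hits_log += 1
    return hits_raw / n_sims, hits_log / n_sims


raw_power, log_power = simulated_power(n_per_group=100, cost_ratio=0.8)
print(f"power on raw costs: {raw_power:.2f}")
print(f"power on log costs: {log_power:.2f}")
```

The same simulation framework could be extended to examine other factors from the list above, for example by dropping a share of simulated patients to mimic loss to follow-up.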

Conclusion
In conclusion, the estimation of economic study power is a process of exploration and not a single calculation of a simple parameter. This process forces the researcher to consider many aspects of the study design and to perhaps revisit clinical design decisions that have been made previously. In some rare cases, the best outcome from an economic power analysis for a clinical trial may be the decision not to proceed because critical design constraints for clinical outcomes prevent the design of a solid economic component. However, in most cases, experienced researchers are able to apply the process of power analysis in a creative and open-minded fashion to gauge the most sensitive cost measures.

References

  1. Briggs AH. Economic evaluation and clinical trials: size matters. BMJ 2000;321:1362-3.
  2. Lipsey MW. Design Sensitivity: Statistical Power for Experimental Research. Newbury Park, CA: Sage Publications, 1990.
  3. Kraemer HC, Thiemann S. How Many Subjects? Statistical Power Analysis in Research. Newbury Park, CA: Sage Publications, 1987.
  4. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. New York: Academic Press, 1985.
  5. Charlson ME, Pompei P, Ales KL, McKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chron Dis 1987;40:373-83.
  6. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol 1992; 45:613-9.
