Trial Emulation With Real-World Data: Evidence on Feasibility, Challenges, and Opportunities
Tyler D. Wagner, PharmD, Virginia Commonwealth University School of Pharmacy, Richmond, VA, USA; Jacinda Tran, PharmD, MBA, Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA
On the final day of the conference, Marc Berger, MD (Consultant, USA), moderated a discussion around the feasibility, challenges, and opportunities that surround trial emulation with real-world data (RWD). This session was part of the RWE subprogram for the ISPOR 2022 conference.
Shirley Wang, PhD (Brigham and Women's Hospital and Harvard Medical School, USA) began the discussion with an explanation of RCT-DUPLICATE, a demonstration project designed to understand and improve the validity of real-world evidence (RWE) studies to support decision making. The project emulated and predicted randomized controlled trials (RCTs), considered how to conduct transparent and reproducible RWE studies, and identified factors that increased the validity of RWE studies. Wang specified that the purpose of DUPLICATE was to examine the calibration of results between database studies and trials in cases where the trials can be emulated well enough to allow a fair comparison. However, we can still learn from studies in which certain aspects of trial design are difficult to emulate.
Wang focused a segment of her discussion on differentiating emulation differences from bias. She posed the question, “Are we asking a different question?” because there will inevitably be differences between a clinical trial and a database study. Emulation differences are defined as differences between the RCT and the RWE study. Highlighting key characteristics from a 2020 publication by Franklin et al, Wang noted that emulation differences can arise in the population: even when the same inclusion-exclusion criteria are applied, differences in population distribution are often observed. There are also differences in treatment strategy, with some trials having specific protocols that include loading doses, dose titrations, or procedures to maximize adherence that are not emulable with clinical practice data. Bias, on the other hand, reflects differences between treatment arms within the database study itself: differences in how the population is ascertained and differences in outcomes or follow-up within the study. Wang stated that “for any trial emulation, there are going to be a mix of emulation differences as well as biases, making it difficult to tease them apart.”
"When data are fit-for-purpose and proper study design and analysis are employed, RWE studies may be able to have similar conclusions to RCTs about treatment effects." —Shirley Wang, PhD
Wang presented a case study on time-varying treatment effects related to the HORIZON-PIVOTAL trial, an RCT comparing zoledronic acid with placebo in patients with osteoporosis, with hip fracture as the outcome. For the database (RWE) study, the authors compared zoledronic acid with raloxifene. Their findings indicated an emulation difference: in clinical practice, patients were observed on treatment for a very short period of time, whereas trial patients were observed over long follow-up with time-varying treatment effects. This work demonstrated that it can be challenging to replicate trial findings when the treatment effect is delayed and that clinical practice patients may not experience the full benefit seen in explanatory trials.
To drive this point home, Wang explained that because certain challenges are expected to subtly shift the target question of a database study relative to the RCT, the study team separated the trials into two groups: one with few challenges in emulating trial design and a second with more substantive challenges. The team found that trials with few challenges showed high correlation and close agreement between RCT and RWE effect estimates, whereas more substantive challenges resulted in low correlation and less agreement.
Wang concluded by highlighting a few take-home points: RWE studies come to the same conclusions as RCTs when they are able to emulate well and target the same question; evaluation of replicability can be nuanced; and we should consider a hypothetical target trial that would pragmatically address the needs of end users. She emphasized that when data are fit-for-purpose and proper study design and analysis are employed, RWE studies may be able to reach similar conclusions to RCTs about treatment effects.
Berger spoke next, presenting the slides and work of William Crown, PhD, who was unable to attend. Berger highlighted some of the learnings from clinical trial emulations, emphasizing that the goal of emulating trials is not to show that it is possible to get the same answer, but to help us understand when it is and is not possible to replicate findings from the target trial. The vast majority of trials cannot be closely emulated with RWD due to the granularity and missingness of the data, so it is critical to get the study design right. Further, complex treatment regimens and clinical inclusion/exclusion criteria inhibit most clinical trial emulations. Additional learnings from clinical trial emulation efforts include that it is possible to estimate similar treatment effects with observational data, that the data necessary to emulate economic target trials are generally available in observational study data, and that variability in researcher decision making is understudied.
"It can be challenging to replicate trial findings when the treatment effect is delayed and that clinical practice patients may not experience the full benefit seen in explanatory trials."
Next, Seamus Kent, PhD (National Institute for Health and Care Excellence [NICE], UK) addressed the value of target trial emulation and the potential use of RWE in health technology assessment (HTA), discussing where emulation studies can offer the greatest value in HTA now and in the future. Emulation offers value when comparing trial data with RWD, but it is more challenging at launch, when RWD has not yet accrued or is extremely limited and there are fundamentally different processes of care and data generation. Post-launch, once RWD has accrued, emulation studies offer value in building upon trial evidence and addressing additional questions of interest to HTA bodies, such as extending trial findings to patient populations excluded from trials and to new indications. To use RWD more routinely in emulation, we still need to understand the following: (1) the likelihood of success if there are no prior RCTs, (2) where emulation is and is not likely to work as an approach, (3) how it transfers across contexts, and (4) trade-offs between study designs. Emulation studies have added value, but there is more work to be done!
To conclude the session, the presenters were asked whether they can trust RWD, whether it is fit for purpose, and whether it is regulatory-grade data. Wang and Kent both responded that it depends on the research question at hand and whether the study design is aligned appropriately, and that the answer will often be determined on a case-by-case basis. And while there are many situations in which a trial cannot be emulated, when emulation succeeds for one indication, there is the possibility of applying the methods to emulate an additional indication. If you ask the right question, any dataset can provide useful information, and more emulation studies are needed to add to the literature and to better map where emulation is and isn't likely to work.