Taking the Call: Multisource Datasets Speed Real-World Data Fitness Assessments in Oncology
Mary Tran, MS, Data Insights, Syapse, San Francisco, CA, USA
Editor’s Note: In this issue of Value & Outcomes Spotlight we feature a new column wherein readers respond to a previously published article. This article was written in response to a piece published in the May/June issue, “Fit-for-Purpose Real-World Data Assessments in Oncology: A Call for Cross-Stakeholder Collaboration,” by Desai, et al.
An article by Desai, et al [1] that appeared in the May/June 2021 issue of Value & Outcomes Spotlight defined the promise and challenges associated with using real-world evidence (RWE) that draws on real-world data (RWD) sources for health economics and outcomes research (HEOR).
We agree with Desai, et al that there remains a need for clearly outlined “use-case specifications,” broadly defined as specifications of RWD requirements and criteria to evaluate RWD fitness for use for specific RWE use cases. As the authors argue, a relevance assessment framework will benefit all stakeholders involved in developing and maintaining use-case specifications. As they have also written, cross-stakeholder collaboration is required to arrive at a shared definition of use-case specifications, including relevant quality thresholds and identification of benchmarking resources for validation strategies.
The Use-case specific Relevance and Quality Assessment (UReQA) put forth by Desai, et al is an excellent, accurate framework. Yet, to further streamline assessment and use of RWD by researchers, we propose a preceding, broader examination of a database’s makeup and its ability to support a spectrum of oncology research needs. As Desai, et al state, there are many real-world databases and it is a challenge to determine which are appropriate. Those that incorporate a multisource data strategy are more likely to overcome limitations often inherent in individual data sources. In addition to the uses listed by Desai, et al, when a broader dataset is applied, RWD have the potential to support an expansive ecosystem of partners with patient identification for clinical trials, health disparities and outcomes research, tailoring optimal treatment regimens, understanding distinct populations, handoffs between nononcology and oncology providers, and developing more cost-efficient external control arms. The ability to leverage one rich dataset to answer multiple questions promotes efficiency and therefore time and cost savings.
Understanding the Inherent Strengths and Limitations of Individual RWD Sources Before the Use Case
As outlined by Desai, et al, regulatory and payer guidelines have highlighted “fitness for use,” also known as “fitness for purpose,” as a key factor that drives the choice of RWD and analytic methods for RWE generation. In determining fitness for use, questions about a particular RWD source can range from quantitative in nature (eg, patient counts and percent missingness) to qualitative (eg, data quality and population similarity). Building upon the UReQA framework, to streamline determination of fitness for use for a particular use case, we propose a qualitative and quantitative deep dive into distinct RWD sources’ data quality even prior to determining specific use cases.
Both dimensions of data quality (reliability and relevance) may be applied to RWD source evaluation prior to the specific use case, taking into account the broad array of oncology research questions first, from care prior to cancer diagnosis through outcomes. Reliability is evaluated in terms of completeness, accuracy, and consistency, while relevance is thought of in terms of recency, representation, and historical capture. No single data source passes all quality dimensions. For example, hospital tumor registries are highly regarded and very reliable, but certain elements (eg, recurrence, biomarkers, safety events) critical for select research questions might be absent depending on the individual registry, and as such they too are limited in longitudinality and comprehensiveness. Claims data may lack key patient characteristics and presentations relevant for study questions. Data sources originating from the outpatient care setting lack the full scope of care that takes place within the inpatient hospital setting. Because all RWD sources have their limitations, one RWD source is less likely to meet fit-for-purpose parameters on its own. Researchers should evaluate the specific strengths and limitations of individual data sources to determine which research questions and use cases can be addressed and then what other data might be needed. The analogy of Swiss cheese is helpful here: every source has gaps throughout, just as every slice of Swiss cheese has holes throughout. But if you place enough slices of Swiss cheese on top of each other, you’re likely to fill in all the gaps.
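To make the Swiss cheese analogy concrete, the following is a minimal sketch of how percent missingness might be compared for individual sources and for their layered combination. The toy records, the element names, and the helper functions `percent_missing` and `layer` are all hypothetical illustrations, not part of the UReQA framework. The sketch also assumes patient records have already been linked across sources by a shared identifier, which in practice is a substantial task in its own right.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]
Source = Dict[str, Record]  # patient ID -> record

# Hypothetical toy data; None means the element was not captured.
registry: Source = {
    "p1": {"stage": "II", "biomarker": None, "recurrence": None},
    "p2": {"stage": "III", "biomarker": None, "recurrence": None},
    "p3": {"stage": "I", "biomarker": None, "recurrence": "no"},
}
emr: Source = {
    "p1": {"stage": None, "biomarker": "EGFR+", "recurrence": "yes"},
    "p2": {"stage": None, "biomarker": None, "recurrence": "no"},
    "p3": {"stage": "I", "biomarker": "ALK-", "recurrence": None},
}

ELEMENTS: List[str] = ["stage", "biomarker", "recurrence"]


def percent_missing(source: Source, elements: List[str]) -> Dict[str, float]:
    """Percent of patients with no value for each data element."""
    n = len(source)
    if n == 0:
        return {e: 0.0 for e in elements}
    return {
        e: 100.0 * sum(1 for rec in source.values() if rec.get(e) is None) / n
        for e in elements
    }


def layer(*sources: Source) -> Source:
    """Stack sources: earlier sources win, later ones only fill gaps."""
    combined: Source = {}
    for source in sources:
        for pid, rec in source.items():
            target = combined.setdefault(pid, {})
            for element, value in rec.items():
                if target.get(element) is None:
                    target[element] = value
    return combined


combined = layer(registry, emr)
for name, src in [("registry", registry), ("emr", emr), ("combined", combined)]:
    print(name, percent_missing(src, ELEMENTS))
```

In this toy example the registry captures stage for every patient but no biomarkers, and the second source fills most of those gaps. One biomarker value is still missing after layering, a reminder that stacking sources narrows gaps but does not guarantee that every one is closed. The "earlier source wins" rule is only one possible way to resolve conflicts between sources.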
A Multisource RWD Strategy to Promote Sustainability and Efficiency
Once we have a clear picture of each individual RWD source, we can begin to build a multisource RWD strategy that brings disparate, overlapping data sources together into a single comprehensive view of the patient journey that can be used to inform a variety of research needs.
The ability to leverage one dataset to answer multiple questions promotes efficiency and therefore time and cost savings. In this way, the suitability and sustainability of a particular dataset for a partner organization may lie in its ability to meet that organization’s diverse research needs. However, diverse research needs compound the existing challenges of working with heterogeneous RWD. It should be noted that gathering multiple large datasets onto a single platform is not the same as linking and integrating information into a single patient record, which may require collaboration with health systems that have multiple electronic medical records, sourcing from different internal systems, and working with laboratories at varying levels of sophistication and with widely diverse reporting standards. Layering multiple sources together is a more complex, challenging undertaking than relying on a single RWD source for insights, but it enables us to develop a much more complete picture of each patient’s cancer care journey.
For these reasons, we propose extending the UReQA framework with additional metrics for the assessment of RWD fitness for use, particularly at the pre-assessment stage, including whether multiple data sources are employed. Under relevance, in addition to representation, it is pertinent to ask whether there is a comprehensive list of data elements (exposures, outcomes, covariates), whether there is sufficient longitudinality to reflect the full course of care and patient response, and whether the data are recent enough to adequately reflect actual outcomes.
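The three relevance questions above could be expressed as a simple pre-assessment check. The following is a minimal sketch only: the function name `relevance_preassessment`, the required element groups, and the follow-up and staleness thresholds are hypothetical assumptions chosen for illustration, and real thresholds would need to come from the cross-stakeholder specifications discussed earlier.

```python
from datetime import date
from typing import Dict, Set

# Hypothetical element groups a dataset is expected to cover.
REQUIRED_ELEMENTS: Set[str] = {"exposure", "outcome", "covariates"}


def relevance_preassessment(
    elements: Set[str],
    first_event: date,
    last_event: date,
    as_of: date,
    min_follow_up_days: int = 365,   # assumed threshold, not from the source
    max_staleness_days: int = 180,   # assumed threshold, not from the source
) -> Dict[str, object]:
    """Answer the three relevance questions for one dataset."""
    return {
        # Is the list of data elements comprehensive?
        "missing_elements": sorted(REQUIRED_ELEMENTS - elements),
        # Is there enough longitudinality to reflect the course of care?
        "longitudinal": (last_event - first_event).days >= min_follow_up_days,
        # Are the data recent enough to reflect actual outcomes?
        "recent": (as_of - last_event).days <= max_staleness_days,
    }


report = relevance_preassessment(
    elements={"exposure", "outcome"},
    first_event=date(2019, 1, 15),
    last_event=date(2021, 3, 1),
    as_of=date(2021, 6, 1),
)
print(report)
```

Here the hypothetical dataset passes the longitudinality and recency checks but lacks covariates, which would signal the need for an additional data source before any specific use case is assessed.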
While not generated for the purpose of research use, if appropriately handled and analyzed, RWD can unlock immense value. At a high level, RWD support diverse research needs across the healthcare continuum, including by:
• Helping to clarify how therapies perform in real-world populations that are underrepresented in clinical trials (eg, minority communities, patients with comorbidities, and the aging population).
• Helping providers identify and close gaps in care to ensure every patient is given their best shot at managing their disease effectively.
Every dataset has limitations, but with collaboration there is a way to fully realize the potential of RWD. In doing so, not only data quality but also methodological and analytical robustness must be considered. Underlying the potential success of all these efforts are transparency and ongoing discussion and collaboration at an industry level to move the field forward.
1. Desai K, Chandwani S, Ru B, Reynolds MW, Christian JB, Estiri H. Fit-for-purpose real-world data assessments in oncology: a call for cross-stakeholder collaboration. Value & Outcomes Spotlight. 2021;7(3):34-37.