Structured Evaluation of Oncology Real-World Data Quality for Practical Applications
Author(s)
Vivek Verma, PhD, Other1, Pegah Farrokhi, PharmD2, Marcus Lawrance, MS3, Ping Sun, PhD4, Danielle Bargo, PhD5;
1AstraZeneca Canada, Mississauga, ON, Canada, 2University of Minnesota, Minneapolis, MN, USA, 3AstraZeneca Farmacéutica Spain S.A., Madrid, Spain, 4AstraZeneca PLC, Cambridge, United Kingdom, 5AstraZeneca Pharmaceuticals LP, Gaithersburg, MD, USA
OBJECTIVES: Assessing the quality of real-world data (RWD) is crucial given the growing reliance on RWD to support regulatory decisions and improve patient outcomes. This study aimed to develop a robust, context-specific framework to quantify RWD quality, particularly for oncology applications. The framework was designed to enable standardized comparisons across RWD sources and to systematically evaluate their strengths and limitations.
METHODS: Building on this framework, a comprehensive tool was developed to operationalize the evaluation process by assessing six key dimensions of RWD quality: relevance (alignment with the research question), reliability (accuracy, completeness, and provenance), extensiveness (depth and breadth of data), timeliness (recency and frequency of data collection and curation), coherence (consistency, harmony, and logical alignment), and convenience (ease of access and usability). These dimensions were selected for their significance to data integrity and decision-making. The tool comprises approximately 50 assessment items, refined and weighted through expert input via the Delphi method. Relevance, being highly context-dependent, was assessed separately from the other, more generic data quality attributes. The tool was applied to three prominent RWD sources—Flatiron EDM, Tempus, and Optum Clinformatics—to evaluate their suitability for overall survival analyses in non-small cell lung cancer (NSCLC) patients.
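The abstract does not give the scoring formula, so the following is only a minimal sketch of how weighted assessment items could be rolled up into dimension scores, with relevance reported separately from the five generic dimensions. The `Item` class, the 0 to 1 rating scale, the weighted-mean aggregation, and every weight and rating shown are assumptions made for illustration; they are not the authors' tool or results.

```python
from dataclasses import dataclass

# The five generic dimensions named in the abstract; relevance is kept apart
# because it depends on the research question.
GENERIC_DIMENSIONS = (
    "reliability", "extensiveness", "timeliness", "coherence", "convenience",
)


@dataclass(frozen=True)
class Item:
    """One assessment item (hypothetical structure)."""
    dimension: str
    weight: float  # assumed expert-derived weight, e.g. from a Delphi round
    score: float   # assumed rating on a 0-1 scale


def dimension_scores(items):
    """Weighted mean of item scores within each dimension (assumed rule)."""
    weighted_sum = {}
    weight_total = {}
    for item in items:
        weighted_sum[item.dimension] = (
            weighted_sum.get(item.dimension, 0.0) + item.weight * item.score
        )
        weight_total[item.dimension] = (
            weight_total.get(item.dimension, 0.0) + item.weight
        )
    return {
        dim: weighted_sum[dim] / weight_total[dim]
        for dim in weighted_sum
        if weight_total[dim] > 0
    }


def summarize(items):
    """Report relevance separately from the mean of the generic dimensions."""
    scores = dimension_scores(items)
    generic = [scores[d] for d in GENERIC_DIMENSIONS if d in scores]
    return {
        "relevance": scores.get("relevance"),
        "generic_quality": sum(generic) / len(generic) if generic else None,
        "by_dimension": scores,
    }


# Made-up items for an unnamed source; not data from the study.
example_items = [
    Item("relevance", weight=1.0, score=0.5),
    Item("reliability", weight=2.0, score=1.0),
    Item("reliability", weight=1.0, score=0.4),
    Item("timeliness", weight=1.0, score=0.6),
]
print(summarize(example_items))
```

Running the same function over each candidate source with a shared item list would give comparable per-dimension scores, which is the kind of side-by-side comparison the tool is described as supporting.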
RESULTS: The quality assessment revealed that Flatiron EDM demonstrated the highest relevance, reliability, timeliness, and convenience for the NSCLC use case. Tempus excelled in extensiveness, while Optum Clinformatics emerged as the most coherent data source. These findings illustrate the tool's capability to identify nuanced strengths and limitations across diverse RWD sources.
CONCLUSIONS: This data quality framework presents a systematic and pragmatic approach to evaluating RWD quality, addressing objectivity, generalizability, and scalability challenges. The proposed tool provides actionable insights to guide data selection and utilization, ultimately enhancing the credibility of RWD in decision-making processes. Future work will aim to refine the tool and conduct validation studies.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
PT41
Topic
Real World Data & Information Systems
Topic Subcategory
Data Protection, Integrity, & Quality Assurance
Disease
SDC: Oncology