Optimizing EHR Data Completeness: A Conceptual Framework for Bringing Real-World Data into Clinical Research through Relevant Completeness

Author(s)

Priyanka Ramamurthy, BA, MBA, Ruby Maa, BS, Dan Drozd, MD, MSc.
PicnicHealth, San Francisco, CA, USA.
OBJECTIVES: The successful incorporation of real-world data into clinical research requires a comprehensive understanding of patients' healthcare journeys. We introduce a conceptual framework to address persistent challenges in achieving data completeness: defining what constitutes "completeness" in electronic health record (EHR) data; identifying and mitigating gaps caused by patient movement between health systems; addressing events that were either not recorded or not anticipated, and thus not searched for; and developing scalable strategies for facility-specific record requests.
METHODS: This framework uses Retrieval Density, a measure of relevant completeness defined as the ratio of retrieved visit records from targeted facilities to the number of signals indicating those care encounters occurred. Key steps include:
    (1) Initial Chart Reviews: Establish disease-, treatment-, and study-specific parameters through reviews conducted manually or with human-in-the-loop methods.
    (2) Retrospective Retrieval: Leverage signals from health data sources (e.g., physician notes, claims data) to access relevant completed visits.
    (3) Prospective Prediction: Create synthetic signals to anticipate relevant patient care.
    (4) Patient Engagement: Request provider and visit details directly from patients.
    (5) Data Classification: Differentiate between non-occurrence (event did not happen) and incomplete effort (records not retrieved).
    (6) Bias Management: Implement quality control measures, such as assessments of retrieval density across health status groups.
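
As an illustration of the Retrieval Density definition above and the stratified quality check in step (6), the following is a minimal Python sketch rather than the authors' implementation. The `Signal` record, its field names, the group labels, and the toy values are all hypothetical; the abstract specifies only the ratio of retrieved visit records to signals indicating that care encounters occurred.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Signal:
    """One indication that a care encounter occurred (hypothetical schema)."""
    patient_id: str
    health_status_group: str   # assumed label used only for stratification
    record_retrieved: bool     # True if the matching visit record was obtained


def retrieval_density(signals: Iterable[Signal]) -> Optional[float]:
    """Retrieved visit records divided by the number of encounter signals.

    Returns None when there are no signals, since the ratio is undefined.
    """
    signals = list(signals)
    if not signals:
        return None
    retrieved = sum(1 for s in signals if s.record_retrieved)
    return retrieved / len(signals)


def density_by_group(signals: Iterable[Signal]) -> Dict[str, Optional[float]]:
    """Retrieval density per health status group, as a simple bias check."""
    grouped = defaultdict(list)
    for s in signals:
        grouped[s.health_status_group].append(s)
    return {group: retrieval_density(items) for group, items in grouped.items()}


# Toy data, invented purely for illustration.
example_signals = [
    Signal("p1", "stable", True),
    Signal("p1", "stable", True),
    Signal("p2", "stable", False),
    Signal("p3", "progressing", True),
    Signal("p3", "progressing", False),
    Signal("p4", "progressing", False),
]

overall = retrieval_density(example_signals)
by_group = density_by_group(example_signals)
```

In this toy data the overall density is 3/6, while the two groups differ (2/3 versus 1/3). A gap of this kind between health status groups is the sort of pattern the bias management step is meant to surface.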

RESULTS: The framework supports the concept of relevant, rather than exhaustive, completeness, allowing for (1) improved retrieval of key records through disease-informed methods, (2) differentiation between types of missing data, which enhances analytical accuracy, and (3) effective mapping of patient care pathways to identify and address potential biases. Preliminary implementations demonstrate adaptability to diverse diseases and study designs.
CONCLUSIONS: Achieving complete EHR data for clinical research is not a zero-sum game. By prioritizing relevance, the Retrieval Density framework optimizes effort and mitigates biases. Its adoption is expected to advance the rigor of clinical research and support the integration of real-world data into regulatory and decision-making processes.

Conference/Value in Health Info

2025-05, ISPOR 2025, Montréal, Quebec, CA

Value in Health, Volume 28, Issue S1

Code

RWD24

Topic

Real World Data & Information Systems

Topic Subcategory

Distributed Data & Research Networks

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
