REPRESENTATIVENESS OF LINKED CLAIMS-EHR DATA: CLAIMS-ONLY VS CLAIMS+EHR POPULATIONS

Author(s)

Lauren E. Parlett, PhD, Amita Ketkar, MS, Judith J. Stephenson, MS, Michael Grabner, PhD, Katherine M. Harris, PhD, Vincent J. Willey, PharmD;
Carelon Research, Wilmington, DE, USA
OBJECTIVES: HEOR often requires the integration of administrative claims with electronic health records (EHR) to provide insights into patient characteristics and clinical and economic outcomes. As integrated data proliferate, questions arise about the characteristics of patients in the resulting datasets. We compared the characteristics of patients with claims data only versus those with both claims and EHR data.
METHODS: The Healthcare Integrated Research Database (HIRD®) is a large US claims and EHR database for health-related research. HIRD members were required to have continuous medical and pharmacy enrollment during 2024. Members with 2024 EHR data were labeled “EHR+”; all others were labeled “EHR-”. Race/ethnicity was self-reported or imputed. Area-level characteristics such as socioeconomic status (SES) and urbanicity were linked by residence. For categorical measures, probability distributions were compared using the overlap index (η), where 0% means no overlap and 100% means complete overlap.
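The abstract does not spell out how η was computed. As an illustration only, the sketch below assumes the common definition for categorical data: the sum, over categories, of the smaller of the two group proportions, expressed as a percentage. The function name `overlap_index` and the two-category sex distributions (built from the rounded percentages reported in the results) are hypothetical and are not taken from the authors' analysis.

```python
from typing import Mapping


def overlap_index(p: Mapping[str, float], q: Mapping[str, float]) -> float:
    """Overlap (in percent) between two categorical distributions.

    Assumed definition: eta = 100 * sum_c min(p_c, q_c), after each input
    is normalized to sum to 1. Inputs may be counts or percentages.
    """
    p_total = sum(p.values())
    q_total = sum(q.values())
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution needs a positive total")

    # A category missing from one group contributes zero overlap.
    categories = set(p) | set(q)
    shared = sum(
        min(p.get(c, 0.0) / p_total, q.get(c, 0.0) / q_total)
        for c in categories
    )
    return 100.0 * shared


# Hypothetical inputs: rounded sex distributions for the two groups.
ehr_plus = {"female": 58, "male": 42}
ehr_minus = {"female": 48, "male": 52}

eta = overlap_index(ehr_plus, ehr_minus)
print(f"overlap index: {eta:.1f}%")
```

With these rounded inputs the sketch gives roughly 90%, close to the reported η=90.7 for sex. An exact match is not expected, because the published value was presumably computed from unrounded data and possibly with additional categories.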
RESULTS: Of the 13,970,339 selected members, 24% were EHR+ and 76% were EHR-. Compared to EHR- members, EHR+ members were generally older (median 49 versus 37 years) and more often female (58% versus 48%, η=90.7). EHR+ members were more likely to be enrolled in Medicare Advantage (21% versus 5%, η=83.9), to reside in the Midwest (43% versus 19%, η=76.1), and to identify as non-Hispanic White (77% versus 64%, η=86.9). SES (η=94.0) and urbanicity (η=93.8) were comparable. EHR+ members had a higher prevalence of every Quan-Charlson comorbidity condition. In 2024, 58% of EHR+ members had ≥1 BMI measurement (mean: 3) and 57% had ≥1 blood pressure measurement (mean: 5.2). EHR+ members were more likely to have laboratory results (48% versus 24%, η=88.3) and inpatient or ER encounters (inpatient 9% versus 4%, η=97.4; ER 21% versus 12%, η=95.4), and they had higher all-cause costs (median $3,076 versus $1,007).
CONCLUSIONS: Large, integrated claims plus EHR datasets provide opportunities to address clinically focused evidence needs. Assessing how individuals in the integrated dataset differ from the source population is critical when interpreting and applying study findings.

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

RWD172

Topic

Real World Data & Information Systems

Topic Subcategory

Reproducibility & Replicability

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
