IMPACT OF METHOD CHOSEN FOR DEFINING OBSERVABLE TIME IN LINKED OPEN DATA SOURCES

Author(s)

Anna Swenson, MPH, Gursimran Basra, MCA, Paul Buzinec, MS, Kathryn Starzyk, BA, MSc;
OM1, Boston, MA, USA
OBJECTIVES: Defining observable time in real-world data sources that lack enrollment information is an important study design consideration. We examined how different methods for defining observation periods in linked electronic medical record (EMR) and open claims data affect sample size, comorbidity prevalence, medication usage, and healthcare resource utilization (HCRU) across three conditions.
METHODS: Eligible patients were identified from OM1 RWDC curated clinical datasets with linked EMR and open medical claims in Atopic Dermatitis (AD), Rheumatoid Arthritis (RA), and Major Depressive Disorder (MDD). We compared three methods for defining observability in the linked data. Method 1: persistence window (gap < 548 days), requiring complete EMR/claims overlap. Method 2: encounter-based (≥1 EMR and ≥1 claims encounter within 12 months post-index; any encounter >12 months post-index). Method 3: modified encounter-based (≥1 EMR and ≥1 claims encounter within 12 months post-index; ≥1 EMR and ≥1 claims encounter >12 months post-index). Demographics, comorbidities, medications, and HCRU were compared across methods.
RESULTS: Initial patient counts were 94,368 (AD), 282,850 (RA), and 1,063,161 (MDD). After applying observability criteria, Method 1 produced the largest sample size reductions (-59.6% [RA] to -74.7% [MDD]), Method 2 the smallest (-36.2% [RA] to -46.8% [MDD]), and Method 3 intermediate reductions (-56.5% [RA] to -66.7% [MDD]). Mean age and the proportion of patients with a Charlson Comorbidity Index ≥2 were highest with Method 1 and lowest with Method 2. While comorbidity and medication prevalence varied only slightly across methods, outpatient visit counts showed larger differences, with Method 1 having the highest counts and Method 2 the lowest. These patterns were largely consistent across conditions, although the magnitude of the differences varied.
CONCLUSIONS: Specifying observation periods requires trade-offs between data completeness and sample representativeness. Stricter definitions of linked data availability yielded smaller, older, and sicker populations with more complete data. Method choice must align with study goals to ensure fit-for-purpose data, and selection bias should be carefully assessed and mitigated, with attention to condition-specific and data source-specific characteristics.
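
ILLUSTRATIVE SKETCH: The three observability methods described under METHODS can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give operational details, so the following are all assumptions made for illustration: "12 months" is treated as 365 days; the index date counts as inside the first window; the persistence window is applied separately to each source, starting at index; and "complete EMR/claims overlap" is read as ending observation at the earlier of the two persistence end dates. All function and variable names are hypothetical.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 548                 # from the abstract: gap < 548 days
FIRST_YEAR = timedelta(days=365)   # assumption: "12 months" = 365 days


def persistence_end(index, encounters, max_gap_days=MAX_GAP_DAYS):
    """Last encounter date reachable from index while every gap between
    consecutive dates stays below max_gap_days; None if no such encounter."""
    end = None
    prev = index
    for d in sorted(d for d in encounters if d >= index):
        if (d - prev).days >= max_gap_days:
            break  # gap too long: the persistence window closes here
        end = prev = d
    return end


def method1_observable_end(index, emr, claims):
    """Method 1 (assumed reading): both sources must persist; observation
    ends at the earlier of the two persistence end dates."""
    emr_end = persistence_end(index, emr)
    claims_end = persistence_end(index, claims)
    if emr_end is None or claims_end is None:
        return None
    return min(emr_end, claims_end)


def _in_first_year(index, dates):
    return any(index <= d <= index + FIRST_YEAR for d in dates)


def _after_first_year(index, dates):
    return any(d > index + FIRST_YEAR for d in dates)


def method2_observable(index, emr, claims):
    """Method 2: >=1 encounter in each source within 12 months post-index,
    plus any encounter (either source) more than 12 months post-index."""
    early = _in_first_year(index, emr) and _in_first_year(index, claims)
    late = _after_first_year(index, emr) or _after_first_year(index, claims)
    return early and late


def method3_observable(index, emr, claims):
    """Method 3: >=1 encounter in each source in both windows."""
    early = _in_first_year(index, emr) and _in_first_year(index, claims)
    late = _after_first_year(index, emr) and _after_first_year(index, claims)
    return early and late


# Hypothetical patient: EMR activity continues, claims activity stops early.
index = date(2020, 1, 1)
emr = [date(2020, 3, 1), date(2021, 6, 1)]
claims = [date(2020, 5, 1)]

print(method1_observable_end(index, emr, claims))  # 2020-05-01
print(method2_observable(index, emr, claims))      # True
print(method3_observable(index, emr, claims))      # False
```

In this made-up example the patient qualifies under Method 2 but not Method 3, because no claims encounter occurs more than 12 months post-index. This mirrors the ordering reported in RESULTS, where Method 2 retained more patients than Method 3.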

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

RWD31

Topic

Real World Data & Information Systems

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Musculoskeletal Disorders (Arthritis, Bone Disorders, Osteoporosis, Other Musculoskeletal), SDC: Sensory System Disorders (Ear, Eye, Dental, Skin)
