COMPARISON OF THE PERFORMANCE OF PATIENT-MEDIATED MEDICAL RECORD RETRIEVAL AND TOKENIZATION-BASED LINKAGE TO GENERATE COMPLETE AND LONGITUDINAL REAL-WORLD DATA
Author(s)
Ashley Cogell, PhD; Reema Patel, MPH
PicnicHealth, San Francisco, CA, USA
OBJECTIVES: This study compared data generation by patient-mediated medical record retrieval with data generation by tokenization-based linkage to a claims data source in multiple sclerosis (MS) and hemophilia A (HA) populations.
METHODS: We analyzed data from PicnicHealth registries comprising patient-level data abstracted from medical records for US adults ≥18 years old with ≥1 encounter during a 5-year observation window (September 1, 2019 to September 1, 2024). Eligible patients were tokenized and linked to medical claims, so that each patient served as their own comparator and data generation could be compared between medical records and claims as the data source. Descriptive statistics are reported.
RESULTS: We identified 4,669 MS and 254 HA patients enrolled in PicnicHealth’s registries. Linkage at enrollment was high (MS 94.9%; HA 95.7%). PicnicHealth captured approximately 2x as many median neurology encounters per MS patient as claims (14.0 vs 7.0; p<0.001) and approximately 9x as many median hematology encounters per HA patient (35.0 vs 4.0; p<0.001). Longitudinal availability (patients with ≥1 encounter of any type) was similar in PicnicHealth and claims through year 2. At year 4, availability was higher in claims for MS (PicnicHealth 73.4% vs claims 90.2%) but higher in PicnicHealth for HA (PicnicHealth 90.5% vs claims 79.8%). Among patients observable at year 4, absence of a specialty encounter in the prior 12 months was more common in claims for both MS (PicnicHealth 4.5% vs claims 48.9%) and HA (PicnicHealth 3.7% vs claims 58.8%). Broken tokens occurred in 7.8% (MS) and 21.8% (HA); mean time to degradation was ~3.2 years in both populations.
CONCLUSIONS: Longitudinal completeness varies by data generation method and patient population. Relying on availability of any encounter may overestimate clinically meaningful observable periods in claims. Token degradation highlights the importance of monitoring linkage durability during follow-up.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
SA50
Topic
Study Approaches
Topic Subcategory
Registries
Disease
No Additional Disease & Conditions/Specialized Treatment Areas