Benchmarking Disease Prevalence in a Large Scale Electronic Health Record Data Network: An Assessment of Chronic and Rare Disease in the United States in 2023
Author(s)
Amanda M. Moore, PharmD, PhD, Matt Scranton, MSc, E Susan Amirian, MSPH, PhD, Jeffrey Brown, PhD
TriNetX, Cambridge, MA, USA
OBJECTIVES: Real-world data sources vary in how well the observed population represents the broader target population. The study objective was to benchmark the 2023 prevalences of common chronic conditions and selected rare diseases, comparing the TriNetX Dataworks-USA network to the US healthcare-seeking population.
METHODS: TriNetX Dataworks-USA is a federated research network of de-identified electronic health records (EHR) data sourced directly from 69 healthcare organizations for over 120 million patients.
The study cohort included all adult patients with a recorded encounter from 01/01/2022 through 12/31/2023. Prevalences of the 10 most common and costly chronic diseases, as reported by the CDC, and of selected rare diseases were estimated as the number of cohort patients with at least one relevant diagnosis code recorded before the end of 2023, divided by the total number of patients with recorded network activity in 2022-2023. These prevalences will be presented in context against other population-based studies.
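The prevalence calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the `Patient` structure, the function names, and the use of the ICD-10 prefix `I10` for hypertension are all assumptions made for the example, since the abstract does not list the code sets used. The adult age restriction is omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import date

# Activity window taken from the abstract: 01/01/2022 through 12/31/2023.
WINDOW_START = date(2022, 1, 1)
WINDOW_END = date(2023, 12, 31)


@dataclass
class Patient:
    """Hypothetical patient record; not the network's actual data model."""
    patient_id: str
    encounter_dates: list = field(default_factory=list)
    # Each diagnosis is a (code, date_recorded) pair.
    diagnoses: list = field(default_factory=list)


def is_active(patient):
    """True if the patient has at least one encounter inside the window."""
    return any(WINDOW_START <= d <= WINDOW_END for d in patient.encounter_dates)


def has_condition(patient, code_prefixes):
    """True if any matching diagnosis code was recorded on or before the window end."""
    return any(
        code.startswith(code_prefixes) and recorded <= WINDOW_END
        for code, recorded in patient.diagnoses
    )


def prevalence(patients, code_prefixes):
    """Cases among active patients divided by all active patients."""
    active = [p for p in patients if is_active(p)]
    if not active:
        return 0.0
    cases = sum(has_condition(p, code_prefixes) for p in active)
    return cases / len(active)


# Illustrative code set only (assumption): ICD-10 I10 for hypertension.
HYPERTENSION_PREFIXES = ("I10",)

example_patients = [
    Patient("p1", [date(2022, 5, 1)], [("I10", date(2019, 3, 1))]),   # active, case
    Patient("p2", [date(2023, 7, 1)], []),                            # active, no diagnosis
    Patient("p3", [date(2020, 1, 1)], [("I10", date(2020, 1, 1))]),   # not active in window
    Patient("p4", [date(2023, 2, 1)], [("I10", date(2024, 2, 1))]),   # diagnosis after 2023
]

hypertension_prevalence = prevalence(example_patients, HYPERTENSION_PREFIXES)
```

Note that a diagnosis recorded at any time before the end of 2023 counts toward the numerator, while the denominator is restricted to patients active in 2022-2023. Patients with a qualifying diagnosis but no encounter in the window are excluded from both.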
RESULTS: As of December 2024, over 36 million patients aged ≥18 years had an encounter in 2022-2023. Of these, 55% were female, 83% had a known race (60% white, 13% Black, 4% Asian, 6% other), 70% had a known ethnicity (8% Hispanic or Latino), and the mean (SD) age was 49.8 (18.7) years. Among adults, the network prevalences of selected common chronic conditions, with the corresponding CDC estimates in parentheses, were: 30.7% hypertension (CDC 32.8%), 28.4% hyperlipidemia (CDC 31.2%), 4.1% COPD (CDC 6.1%), 9.5% asthma (CDC 9.9%), 12.7% diabetes (CDC 12.1%), and 14.9% depression (CDC 21.1%). The prevalences of selected rare conditions were 0.06% cystic fibrosis, 0.15% Ehlers-Danlos syndrome, and 0.02% Turner syndrome.
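To put the network and CDC figures side by side, one simple descriptive summary is the absolute difference in percentage points and the ratio of network to CDC prevalence. The sketch below uses only the six pairs of values reported above. The function and variable names are illustrative, and this is not a statistical test or a method described by the authors.

```python
# (network %, CDC %) pairs as reported in the RESULTS section.
ESTIMATES = {
    "hypertension": (30.7, 32.8),
    "hyperlipidemia": (28.4, 31.2),
    "COPD": (4.1, 6.1),
    "asthma": (9.5, 9.9),
    "diabetes": (12.7, 12.1),
    "depression": (14.9, 21.1),
}


def compare(network_pct, cdc_pct):
    """Return (difference in percentage points, network-to-CDC ratio)."""
    difference = round(network_pct - cdc_pct, 1)
    ratio = round(network_pct / cdc_pct, 2)
    return difference, ratio


# Per-condition comparison of network prevalence against the CDC estimate.
comparison = {name: compare(*pair) for name, pair in ESTIMATES.items()}

# Condition with the largest absolute gap in percentage points.
largest_gap = max(comparison, key=lambda name: abs(comparison[name][0]))

for name, (difference, ratio) in comparison.items():
    print(f"{name}: {difference:+.1f} percentage points, ratio {ratio:.2f}")
```

A negative difference means the network prevalence is below the CDC estimate. Diabetes is the only condition of the six where the network figure exceeds the CDC figure.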
CONCLUSIONS: The prevalences of common chronic diseases observed in the TriNetX Dataworks-USA network are consistent with CDC estimates for the healthcare-seeking US population. Because the network includes many academic medical centers, selected rare diseases may be overrepresented in the network.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
RWD105
Topic
Real World Data & Information Systems
Topic Subcategory
Distributed Data & Research Networks
Disease
No Additional Disease & Conditions/Specialized Treatment Areas