Population Characteristics of a Large, Linked Mother-Child EHR Dataset From Truveta
Author(s)
Sunny Guin, PhD1, Katie Brown, PhD, MSN, RN2, Katherine Gilbert, MPH2, Amy Wu, BS2, Annika Faucon, PhD2, Jordan Swartz, MD2, Emily Webber, PhD2.
1Director of Research Analytics, Truveta, Seattle, WA, USA, 2Truveta, Seattle, WA, USA.
OBJECTIVES: Pregnant women and their children represent a critical and historically underrepresented population in clinical research. Linking maternal and infant health records enables retrospective studies of treatment safety, utilization, and outcomes across the perinatal and early pediatric period. This study describes the demographic and clinical characteristics of Truveta’s linked mother-child dataset, including the prevalence of key maternal and infant outcomes.
METHODS: Truveta Data provides complete, timely, representative, de-identified EHR data on more than 120 million patients from US health systems. This retrospective observational study included women with a pregnancy on or before May 2025 who had a deterministically linked child in Truveta Data. Descriptive statistics were calculated using Truveta Studio.
RESULTS: A total of 1,509,087 mothers and 1,921,387 children were identified (mean 1.27 children per mother, SD 0.55). Mothers were primarily white (60.9%) and not Hispanic or Latino (73.1%), with a mean age of 29.9 years (SD 5.7). Maternal comorbidities included gestational diabetes (9.9%) and preeclampsia (5.2%). Linked children were 53.3% white; 63.9% not Hispanic or Latino; 50.9% male; 3.3% twins. Mean birth weight was 7.2 pounds (SD 1.3). Diagnosed outcomes included preterm birth (2.9%), jaundice (25.6%), hypoglycemia (7.0%), and respiratory distress syndrome (2.9%). Among children born to mothers with gestational diabetes, 15.8% had hypoglycemia, more than double the overall rate. Mean follow-up was 2.6 years (SD 2.8) for mothers and 1.9 years (SD 2.6) for children, allowing for longitudinal tracking of care.
CONCLUSIONS: Truveta’s mother-child dataset offers a large, timely, and representative resource to support real-world studies of maternal and infant health across diverse populations. This analysis confirms the elevated risk of neonatal hypoglycemia associated with gestational diabetes and shows maternal comorbidity rates consistent with published estimates. Lower-than-expected rates of other infant outcomes suggest future research should incorporate additional data (e.g., labs and procedures) to improve outcome and cohort completeness.
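The arithmetic behind two of the reported figures (children per mother, and the "more than double" hypoglycemia comparison) can be checked with a short Python sketch. The input numbers are copied from RESULTS. The function and variable names are illustrative and not part of Truveta Studio. The last step makes an assumption the abstract does not state: that the share of children born to mothers with gestational diabetes equals the 9.9% maternal prevalence. Its output is therefore a rough approximation, not a reported result.

```python
# Figures copied from the RESULTS section of the abstract.
N_MOTHERS = 1_509_087
N_CHILDREN = 1_921_387
HYPO_OVERALL_PCT = 7.0   # hypoglycemia among all linked children
HYPO_GDM_PCT = 15.8      # hypoglycemia among children of mothers with gestational diabetes
GDM_PREVALENCE = 0.099   # gestational diabetes among mothers


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio of two quantities, guarding against a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


def unexposed_rate(overall_pct: float, exposed_pct: float, exposed_share: float) -> float:
    """Back out the unexposed rate from overall = p*exposed + (1-p)*unexposed.

    ASSUMPTION: exposed_share (p) is the fraction of *children* who were exposed.
    The abstract reports prevalence among mothers only, so this is approximate.
    """
    return (overall_pct - exposed_share * exposed_pct) / (1.0 - exposed_share)


children_per_mother = ratio(N_CHILDREN, N_MOTHERS)
gdm_vs_overall = ratio(HYPO_GDM_PCT, HYPO_OVERALL_PCT)
hypo_unexposed_pct = unexposed_rate(HYPO_OVERALL_PCT, HYPO_GDM_PCT, GDM_PREVALENCE)
gdm_vs_unexposed = ratio(HYPO_GDM_PCT, hypo_unexposed_pct)

print(f"children per mother:            {children_per_mother:.2f}")
print(f"hypoglycemia, GDM vs overall:   {gdm_vs_overall:.2f}x")
print(f"hypoglycemia, GDM vs unexposed: {gdm_vs_unexposed:.2f}x (approximate)")
```

The first ratio reproduces the reported mean of 1.27 children per mother, and the second shows that 15.8% is a little over twice the 7.0% overall rate. Because the overall rate includes the exposed children, the comparison against unexposed children would be somewhat larger under the stated assumption.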
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
EPH181
Topic
Epidemiology & Public Health, Real World Data & Information Systems, Study Approaches
Topic Subcategory
Public Health
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, Reproductive & Sexual Health