Using Machine Learning to Enable Timely Postpartum Care by Reliably Detecting Deliveries
Author(s)
Evan Sadler, BS, Ian J. Hooley, BS, Jordan Downey, MPH.
Pomelo Care, New York, NY, USA.
OBJECTIVES: Virtual maternity programs in value-based models rely on timely delivery detection from asynchronous data to initiate postpartum support. However, detection is complicated by discordant signals from claims, Health Information Exchanges (HIEs), and patient reports. We aimed to develop a machine learning (ML) model that synthesizes these data streams into a reliable trigger for clinical follow-up and quality improvement.
METHODS: A ground-truth dataset of ~50,000 labeled rows was created by collating potential delivery signals (i.e., >500 medical codes and patient-reported data) and benchmarking them against clinician-verified delivery dates. An XGBoost classification model was trained to capture complex, non-linear patterns, such as context-dependent signals where a code’s meaning changes with gestational age. To ensure the model generalized to new patients, we used cross-validation with splitting at the pregnancy level, which guarantees that all data from a single pregnancy resides exclusively in either the training or the validation set, preventing data leakage.
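The pregnancy-level splitting described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `pregnancy_level_folds`, the `preg_N` identifiers, and the toy data are hypothetical, and the round-robin fold assignment is an assumption made for illustration. In practice a library utility such as scikit-learn's `GroupKFold` serves the same purpose.

```python
import random


def pregnancy_level_folds(pregnancy_ids, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) row-index lists.

    Whole pregnancies, not individual rows, are assigned to folds, so no
    pregnancy contributes rows to both the training and validation sets.
    """
    unique_ids = sorted(set(pregnancy_ids))
    random.Random(seed).shuffle(unique_ids)
    # Hypothetical assignment rule: deal shuffled pregnancies out round-robin.
    fold_of = {pid: i % n_splits for i, pid in enumerate(unique_ids)}
    for fold in range(n_splits):
        train_idx = [i for i, pid in enumerate(pregnancy_ids) if fold_of[pid] != fold]
        val_idx = [i for i, pid in enumerate(pregnancy_ids) if fold_of[pid] == fold]
        yield train_idx, val_idx


# Toy data (hypothetical): 20 pregnancies, each with 1 to 4 signal rows.
rng = random.Random(42)
row_pregnancy_ids = [f"preg_{p}" for p in range(20) for _ in range(rng.randint(1, 4))]

folds = list(pregnancy_level_folds(row_pregnancy_ids, n_splits=5))
for train_idx, val_idx in folds:
    train_ids = {row_pregnancy_ids[i] for i in train_idx}
    val_ids = {row_pregnancy_ids[i] for i in val_idx}
    # The leakage check: no pregnancy appears on both sides of the split.
    assert train_ids.isdisjoint(val_ids)
```

A plain row-level split would instead scatter one pregnancy's rows across training and validation, letting the model score well by recognizing pregnancies it has already seen.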
RESULTS: Under 5-fold cross-validation, with metrics computed on the held-out folds, the model performed well at both the row (individual data point) and pregnancy levels. At the row level, the model achieved 92% accuracy with an AUC of 0.96. More importantly for clinical implementation, at the pregnancy level, where the model's ultimate goal is to correctly identify which pregnancies have resulted in delivery, the model achieved 85% accuracy, 95% precision, and 87% recall with an AUC of 0.89.
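To make the distinction between row-level and pregnancy-level evaluation concrete, the sketch below rolls row-level scores up to one prediction per pregnancy and computes accuracy, precision, and recall. The abstract does not state how row scores were aggregated; taking the maximum row score per pregnancy, the 0.5 threshold, the function name `pregnancy_level_metrics`, and the toy rows are all assumptions made for illustration.

```python
from collections import defaultdict


def pregnancy_level_metrics(rows, threshold=0.5):
    """rows: iterable of (pregnancy_id, row_score, delivered) tuples.

    Assumed aggregation rule (not from the abstract): a pregnancy is
    predicted as delivered if its highest row score meets the threshold.
    """
    max_score = defaultdict(float)
    delivered_label = {}
    for pid, score, delivered in rows:
        max_score[pid] = max(max_score[pid], score)
        delivered_label[pid] = delivered

    tp = fp = tn = fn = 0
    for pid, score in max_score.items():
        predicted = score >= threshold
        actual = delivered_label[pid]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: share of delivery alerts that were real deliveries.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of real deliveries that triggered an alert.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy rows (hypothetical): four pregnancies, one each of TP, TN, FP, FN.
toy_rows = [
    ("a", 0.2, True), ("a", 0.9, True),    # delivered, detected
    ("b", 0.1, False), ("b", 0.3, False),  # not delivered, no alert
    ("c", 0.7, False),                     # not delivered, false alert
    ("d", 0.4, True),                      # delivered, missed
]
metrics = pregnancy_level_metrics(toy_rows)
print(metrics)
```

Under this reading, precision tracks the false alerts a care team would receive, and recall tracks the deliveries that would go undetected.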
CONCLUSIONS: This study demonstrates that non-linear models like XGBoost can overcome the challenge of discordant, multi-source data to create reliable triggers for clinical action. The high precision at the pregnancy level is particularly valuable for clinical workflows because it minimizes false alerts that could overwhelm care teams, while the strong recall ensures that most deliveries are detected. Models like this can serve as foundational components for value-based maternity telemedicine programs to improve health outcomes and resource efficiency.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR215
Topic
Health Service Delivery & Process of Care, Methodological & Statistical Research, Real World Data & Information Systems
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
Reproductive & Sexual Health