EARLY IDENTIFICATION OF POSTPARTUM DEPRESSION RISK USING ARTIFICIAL INTELLIGENCE
Author(s)
Nehir Yapar, BS1, Selim Onder, MS2, Enes Arikan, MS2, Onur Baser, MA, MS, PhD2;
1Columbia Data Analytics, New York, NY, USA, 2Boğaziçi University, Istanbul, Turkey
OBJECTIVES: This study aimed to develop and evaluate machine learning (ML) models to predict postpartum depression (PPD) among women with high-risk pregnancies using claims-based data.
METHODS: A retrospective cohort study was conducted using Kythera Labs commercial claims data (2016-2024). Women aged 18-45 with a recorded delivery (live birth, stillbirth, or mixed) and continuous enrollment for ≥12 months before and after delivery were included. PPD was identified using ICD-10-CM codes (F53.0/F53.1) or proxy new-onset depression diagnoses within 12 months postpartum. Non-PPD women with established risk factors were included in the training and test sets to represent high-risk populations. Baseline demographic, clinical, and obstetric variables were used for model development, including age, comorbidity indices (Charlson Comorbidity Index [CCI], Elixhauser Index, Chronic Disease Score [CDS]), obstetric complications, and overall risk factors. The ML algorithms evaluated were Logistic Regression, Random Forest, and XGBoost. Models were trained on 70% of the cohort and tested on the remaining 30%. Performance was assessed via AUC-ROC, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and confusion matrices. Feature importance in XGBoost was reported using normalized gain scores.
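The confusion-matrix metrics and the gain normalization named above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the toy labels, and the raw gain values are all hypothetical, and AUC-ROC is omitted because it needs predicted probabilities rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # PPD cases predicted as PPD
    fp: int  # non-PPD cases predicted as PPD
    tn: int  # non-PPD cases predicted as non-PPD
    fn: int  # PPD cases predicted as non-PPD


def confusion_matrix(y_true, y_pred):
    """Count the four outcome cells for binary labels (1 = PPD, 0 = non-PPD)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a cell combination is empty.
    return numerator / denominator if denominator else float("nan")


def screening_metrics(cm):
    """Sensitivity, specificity, PPV, and NPV from a confusion matrix."""
    return {
        "sensitivity": _ratio(cm.tp, cm.tp + cm.fn),
        "specificity": _ratio(cm.tn, cm.tn + cm.fp),
        "ppv": _ratio(cm.tp, cm.tp + cm.fp),
        "npv": _ratio(cm.tn, cm.tn + cm.fn),
    }


def normalize_gain(raw_gain):
    """Rescale raw gain scores so they sum to 1 across features."""
    total = sum(raw_gain.values())
    if total <= 0:
        raise ValueError("total gain must be positive")
    return {name: gain / total for name, gain in raw_gain.items()}


# Toy held-out labels and predictions (hypothetical, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
cm = confusion_matrix(y_true, y_pred)
metrics = screening_metrics(cm)

# Hypothetical raw gain values for three features.
importance = normalize_gain({"risk_indicator": 8.0, "cds": 1.0, "age": 1.0})
```

In a screening setting with a rare outcome, PPV and NPV depend strongly on prevalence, which is one reason to report them alongside sensitivity and specificity rather than relying on AUC-ROC alone.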
RESULTS: A total of 639,477 patients met the study inclusion criteria, of whom 9,750 were positive for PPD and 5 were non-PPD patients. The remainder (629,722) formed the at-risk cohort, in which 32,479 patients were identified by the model as likely to have PPD. Thus, of the 42,229 patients with diagnosed or predicted PPD, only 23% had a recorded PPD diagnosis. The clinical and obstetric risk indicator was the dominant predictor, contributing ~75-80% of total feature importance. Three additional features (CDS, age, Elixhauser Index) contributed modestly (~2-5%), and all remaining predictors had negligible importance.
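The reported counts can be cross-checked with simple arithmetic. The sketch below rests on one assumption that the abstract does not state explicitly: that the 23% figure is the share of diagnosed patients among all patients who were either diagnosed with PPD or identified as likely to have it. The variable names are illustrative only.

```python
# Counts as reported in the RESULTS paragraph.
total_patients = 639_477
diagnosed_ppd = 9_750
non_ppd = 5
flagged_likely_ppd = 32_479

# The at-risk cohort is everyone who is neither diagnosed nor non-PPD.
at_risk = total_patients - diagnosed_ppd - non_ppd

# Assumed denominator: diagnosed plus model-flagged patients.
diagnosed_or_flagged = diagnosed_ppd + flagged_likely_ppd
diagnosed_share = diagnosed_ppd / diagnosed_or_flagged

print(f"at-risk cohort: {at_risk:,}")
print(f"diagnosed or flagged: {diagnosed_or_flagged:,}")
print(f"share with a recorded diagnosis: {diagnosed_share:.1%}")
```

Under that assumption the at-risk count matches the reported 629,722 and the diagnosed share rounds to the reported 23%.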
CONCLUSIONS: Only ~1 in 5 patients with diagnosed or predicted PPD had a recorded diagnosis. Using ML to predict PPD could enable wider recognition, timely intervention, and targeted treatment to slow or mitigate disease progression, and such models could serve as an initial screening tool.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR7
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
SDC: Reproductive & Sexual Health