AN ALGORITHM TO IDENTIFY LESS INVASIVE SURFACTANT ADMINISTRATION USING A REAL-WORLD DATABASE OF PRETERM INFANTS
Author(s)
Xuezheng Sun, PhD1, Annie Simpson, PhD2, Aditi Lahiri, MPH3, Sanjida Mowla4, Dana Edelman, MPH3, Shelby Corman, PharmD, MS, BCPS2, Daniel Fuentes, PharmD4, Dalibor Kurepa, MD5, Michael Kuzniewicz, MD, MPH3
1Chiesi USA, Cary, NC, USA, 2Precision AQ, Bethesda, MD, USA, 3Kaiser Permanente Northern California, Oakland, CA, USA, 4Chiesi, Cary, NC, USA, 5Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
OBJECTIVES: Surfactant replacement therapy is central for management of respiratory distress syndrome (RDS) in preterm infants. Less invasive surfactant administration (LISA) has been increasingly adopted due to improved neonatal outcomes. However, the absence of procedure codes in real-world data (RWD) limits large scale evaluation of these methods. This study aimed to develop an algorithm to identify LISA procedures using administrative data.
METHODS: We conducted a retrospective study using chart reviews as the gold standard to identify preterm infants receiving surfactant via LISA or non-LISA procedures across Kaiser Permanente Northern California facilities. Eighty-two candidate variables were selected from administrative data recorded between birth and first surfactant administration. The algorithm was developed using births between 2019 and 2023, which were randomly split into a training set (n=884) and a testing set (n=379). Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection and model fitting. Model discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC). Algorithm performance was validated using a combined sample of the testing set and a 2024 cohort (n=622), overall and by gestational age, using sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV).
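As a rough illustration of two steps described above, the sketch below randomly splits 1,263 record identifiers into sets of 884 and 379 and computes an AUROC from predicted scores. It is a minimal sketch under stated assumptions, not the study's code: the function names, the random seed, and the toy labels and scores are hypothetical, and the LASSO fit itself is omitted.

```python
import random


def split_train_test(ids, n_train, seed=0):
    """Randomly partition ids into a training list and a testing list."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    return shuffled[:n_train], shuffled[n_train:]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Split sizes taken from the abstract; the ids are placeholders.
train_ids, test_ids = split_train_test(range(1263), n_train=884)
print(len(train_ids), len(test_ids))

# Toy data (hypothetical): 1 = LISA, 0 = non-LISA, scores = predicted probability.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(auroc(toy_labels, toy_scores))
```

The pairwise definition of AUROC is quadratic in sample size, which is acceptable for cohorts of this scale; a rank-based formula gives the same value more efficiently.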
RESULTS: Among 1,263 preterm infants who received surfactant, 462 (36.6%) received surfactant via LISA and 801 (63.4%) via invasive modalities. The LASSO-based model selected 21 variables predictive of LISA based on the training set. The model demonstrated strong discrimination (AUROC=0.87). Using the maximum specificity cut-point (predicted probability ≥0.79), the model achieved sensitivity=43.9%, specificity=96.8%, PPV=90.0% and NPV=72.5%, with an agreement of 75.9% when evaluated in the combined cohort. Sensitivity and specificity were consistent across gestational age subgroups.
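The reported performance measures all follow from a 2×2 table of predicted versus chart-confirmed LISA status. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here are an assumption: they were back-calculated so that the percentages match the reported values for n=622, and they should not be read as study data.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard 2x2 metrics; the positive class is LISA."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # LISA cases correctly flagged
        "specificity": tn / (tn + fp),   # non-LISA cases correctly excluded
        "ppv": tp / (tp + fp),           # flagged cases that are truly LISA
        "npv": tn / (tn + fn),           # unflagged cases that are truly non-LISA
        "agreement": (tp + tn) / total,  # overall agreement with chart review
    }


# Hypothetical counts (assumed, not reported in the abstract), summing to 622.
metrics = classification_metrics(tp=108, fn=138, fp=12, tn=364)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

A high cut-point such as ≥0.79 trades sensitivity for specificity: few non-LISA infants are misclassified as LISA, at the cost of missing more than half of the true LISA cases.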
CONCLUSIONS: Using real-world administrative data, we developed a machine-learning algorithm that identifies LISA among preterm infants with high specificity (96.8%) and PPV (90.0%). Its strong performance supports future research to evaluate the utilization and outcomes of LISA using RWD.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD25
Topic
Real World Data & Information Systems
Topic Subcategory
Health & Insurance Records Systems
Disease
SDC: Pediatrics, SDC: Respiratory-Related Disorders (Allergy, Asthma, Smoking, Other Respiratory)