Advancing Causal Inference With Machine Learning and Real-World Data: An Application of Targeted Machine Learning and Super Learners on Hospital-Acquired Pressure Injuries From MIMIC IV

Author(s)

Wilson A¹, Gregg M², Streja E³, Alderden J⁴, Vanderpuye-Orgle J³, Roessner M³
¹Parexel International, Waltham, MA, USA, ²Parexel International, Austin, TX, USA, ³Parexel International, Boston, MA, USA, ⁴Boise State University, Boise, ID, USA


OBJECTIVES: Traditional causal methods typically rely on parametric statistical models that impose restrictive assumptions about the underlying data structure. Recent advances in targeted machine learning (ML) and super learning allow causal effects to be estimated from real-world data without such parametric restrictions, offering a flexible and comprehensive approach to causal inference. In this study, we use the MIMIC IV database and ML models to investigate causes of hospital-acquired pressure injuries (HAPrI) and to develop a risk prediction algorithm.

METHODS: Using the MIMIC IV dataset (a deidentified electronic health records dataset from Beth Israel Deaconess Medical Center, capturing admissions from 2008 to 2019 for nearly 300,000 patients), we applied cost-sensitive ensemble super learning to predict HAPrI in the ICU. We then estimated the potential causal effect of albumin on HAPrI development using clinically informed debiasing methods, including directed acyclic graphs and targeted maximum likelihood estimation (TMLE).
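To illustrate the targeting step behind TMLE, the following is a minimal sketch on simulated data, not the authors' implementation. It assumes a single binary confounder W, a binary exposure A (standing in for low albumin) and a binary outcome Y; the simulated cohort, its coefficients, and all function names are hypothetical. The initial outcome model deliberately ignores W so that the fluctuation step has something to correct, whereas the study itself would use super learners for the nuisance models.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def odds(p):
    return p / (1.0 - p)

def simulate(n, seed=0):
    """Hypothetical cohort: W = confounder, A = exposure, Y = outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.4)
        a = int(rng.random() < (0.6 if w else 0.2))   # W raises exposure odds
        y = int(rng.random() < expit(-3.0 + 0.8 * a + 1.0 * w))
        rows.append((w, a, y))
    return rows

def crude_or(rows):
    """Unadjusted odds ratio of Y comparing A=1 with A=0."""
    risk = {}
    for a in (0, 1):
        ys = [y for _, ai, y in rows if ai == a]
        risk[a] = sum(ys) / len(ys)
    return odds(risk[1]) / odds(risk[0])

def solve_epsilon(h, y, offset, steps=25):
    """Newton iterations for a one-parameter logistic fluctuation with a fixed offset."""
    eps = 0.0
    for _ in range(steps):
        p = [expit(offset + eps * hi) for hi in h]
        score = sum(hi * (yi - pi) for hi, yi, pi in zip(h, y, p))
        info = sum(hi * hi * pi * (1.0 - pi) for hi, pi in zip(h, p))
        eps += score / info
    return eps

def tmle_or(rows):
    """Marginal odds ratio after targeting a crude initial outcome model."""
    n = len(rows)
    # Propensity score g(w) = P(A=1 | W=w), estimated by stratum proportions.
    g = {}
    for w in (0, 1):
        exposures = [a for wi, a, _ in rows if wi == w]
        g[w] = sum(exposures) / len(exposures)
    risk = {}
    for a in (0, 1):
        arm = [(w, y) for w, ai, y in rows if ai == a]
        # Initial outcome model: arm-specific mean of Y, ignoring W on purpose.
        q_init = sum(y for _, y in arm) / len(arm)
        offset = logit(q_init)
        # Clever covariate: inverse probability of the observed exposure level.
        weight = {w: 1.0 / (g[w] if a == 1 else 1.0 - g[w]) for w in (0, 1)}
        eps = solve_epsilon([weight[w] for w, _ in arm],
                            [y for _, y in arm], offset)
        # Targeted prediction with everyone set to A=a, averaged over the cohort.
        risk[a] = sum(expit(offset + eps * weight[w]) for w, _, _ in rows) / n
    return odds(risk[1]) / odds(risk[0])

rows = simulate(20000, seed=1)
print("crude OR:", round(crude_or(rows), 2))
print("TMLE  OR:", round(tmle_or(rows), 2))
```

In this toy setup the confounder raises both exposure and outcome risk, so the crude odds ratio overstates the effect and the targeted estimate moves toward the simulated marginal value. With saturated propensity strata, the targeted estimate coincides with direct standardization over W, which gives a convenient check on the fluctuation step.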

RESULTS: Of 28,395 eligible cases, 1,395 (4.9%) developed a pressure injury. The ensemble super learner had a cross-validated AUC of 0.8, with 45.6% sensitivity and 88.8% specificity. The crude odds ratio for pressure injury associated with low albumin (below 3.0) was significant: OR = 2.86, p<0.0001. The TMLE-adjusted estimate was attenuated but remained significant: OR = 2.22, p<0.0001.
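As a rough aid to reading these figures, the sketch below derives the prevalence and the predictive values implied by the reported sensitivity and specificity. It assumes, hypothetically, that both metrics apply to the full eligible cohort at the reported event rate; the abstract does not report the derived quantities, and the function name is illustrative only.

```python
def implied_metrics(n_total, n_events, sensitivity, specificity):
    """Back out an expected confusion matrix from summary metrics."""
    n_non_events = n_total - n_events
    tp = sensitivity * n_events          # expected true positives
    fn = n_events - tp                   # expected false negatives
    tn = specificity * n_non_events      # expected true negatives
    fp = n_non_events - tn               # expected false positives
    return {
        "prevalence": n_events / n_total,
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / n_total,
    }

# Figures taken from the abstract; derived values are illustrative only.
metrics = implied_metrics(28395, 1395, 0.456, 0.888)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

At a low event rate, overall accuracy is dominated by the non-event class, which is why sensitivity and predictive values are more informative than accuracy alone when comparing risk models.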

CONCLUSIONS: Previous models predicting pressure injuries often favour overall accuracy at the cost of clinically uninformative sensitivity, whereas the traditional Braden Scale classifies almost all patients as high-risk. Our results suggest that ML methods can be used to develop accurate risk prediction algorithms for HAPrI. We also identified a significant (causal) effect of low albumin levels on the development of pressure injuries. These findings demonstrate how ML methods can generate valuable causal and predictive models and improve our understanding of data interdependencies. The future of clinical prediction lies at the intersection of accuracy, clinical wisdom, and machine learning's expanded use of real-world data.

Conference/Value in Health Info

2023-11, ISPOR Europe 2023, Copenhagen, Denmark

Value in Health, Volume 26, Issue 11, S2 (December 2023)

Code

MSR102

Topic

Methodological & Statistical Research, Real World Data & Information Systems, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Electronic Medical & Health Records, Health & Insurance Records Systems

Disease

Injury & Trauma, No Additional Disease & Conditions/Specialized Treatment Areas
