Causal Machine Learning for Assessing Pneumococcal Vaccine Effectiveness: Innovations in Real-World Data Analysis and Confounding Pathway Adjustment
Author(s)
Wilson A1, Gregg M2, Streja E3, Alderden J4, Vanderpuye-Orgle J3, Roessner M3
1Parexel International, Waltham, MA, USA, 2Parexel International, Austin, TX, USA, 3Parexel International, Boston, MA, USA, 4Boise State University, Boise, ID, USA
OBJECTIVES: Determining real-world effectiveness from observational data requires careful consideration of the data-generating process so that confounding is accounted for and treatment groups are compared properly. This is especially true in settings where health behavior plays a strong role, such as vaccine effectiveness.
We explore causal assumptions and innovations in causal discovery, using publicly available real-world data (RWD), to estimate the protective effects of pneumococcal vaccination, which helps prevent pneumococcal disease caused by Streptococcus pneumoniae bacteria. We demonstrate the need for causal machine learning (ML) methods to supplement traditional methods to de-confound vaccine effectiveness estimates using RWD.
METHODS: We leveraged MIMIC-IV, a deidentified electronic health records dataset from Beth Israel Deaconess Medical Center that captures admissions from 2008 to 2019 for nearly 300,000 patients. To estimate the causal effect of vaccination on pneumococcal disease, we employed directed acyclic graphs to identify potential biasing pathways and targeted maximum likelihood estimation (TMLE) to calculate the estimates. Analysis was performed using R software (v 4.3.0).
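As an illustration of the TMLE steps named above, the following is a minimal sketch in Python (standard library only), not the authors' R implementation. It assumes a single binary confounder (`older`), hypothetical cell counts that are not MIMIC-IV data, and a deliberately crude initial outcome model that ignores the confounder, so that the targeting step has something to correct. All function and variable names are illustrative.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

# Hypothetical counts (NOT MIMIC-IV data): (older, vaccinated, pneumonia) -> n
COUNTS = {
    (0, 1, 1): 8,   (0, 1, 0): 992,
    (0, 0, 1): 90,  (0, 0, 0): 8910,
    (1, 1, 1): 320, (1, 1, 0): 7680,
    (1, 0, 1): 100, (1, 0, 0): 1900,
}

def fit_epsilon(rows, iters=50):
    """Newton's method for the one-parameter logistic fluctuation.

    rows holds (offset, h, y, n): solve sum n*h*(y - expit(offset + eps*h)) = 0.
    """
    eps = 0.0
    for _ in range(iters):
        score = info = 0.0
        for offset, h, y, n in rows:
            p = expit(offset + eps * h)
            score += n * h * (y - p)
            info += n * h * h * p * (1.0 - p)
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break
    return eps

def tmle_risks(counts):
    """Targeted estimates of P(Y=1) under 'all vaccinated' (1) and 'none vaccinated' (0)."""
    n = sum(counts.values())
    n_w = {w: sum(c for (ww, _, _), c in counts.items() if ww == w) for w in (0, 1)}
    risks = {}
    for a in (0, 1):
        arm = {k: c for k, c in counts.items() if k[1] == a}
        # Step 1: initial outcome model Q(a), the crude risk in arm a (ignores age).
        q0 = sum(c for (_, _, y), c in arm.items() if y == 1) / sum(arm.values())
        # Step 2: treatment mechanism g(a | w), estimated by stratum proportions.
        g = {w: sum(c for (ww, _, _), c in arm.items() if ww == w) / n_w[w] for w in (0, 1)}
        # Step 3: clever covariate 1/g(a | w); step 4: fit the fluctuation parameter.
        rows = [(logit(q0), 1.0 / g[w], y, c) for (w, _, y), c in arm.items()]
        eps = fit_epsilon(rows)
        # Step 5: plug the updated outcome model into the marginal risk.
        risks[a] = sum(n_w[w] / n * expit(logit(q0) + eps / g[w]) for w in (0, 1))
    return risks

risks = tmle_risks(COUNTS)
marginal_or = (risks[1] / (1 - risks[1])) / (risks[0] / (1 - risks[0]))
print(f"risk if vaccinated={risks[1]:.4f}, if unvaccinated={risks[0]:.4f}, OR={marginal_or:.3f}")
```

A full analysis would typically use flexible learners for both models, many covariates chosen from the directed acyclic graph, and influence-curve-based confidence intervals, none of which are shown here.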
RESULTS: Initial analysis indicated a paradoxically elevated risk of pneumonia among vaccinated individuals, with a crude odds ratio (OR) of 1.35 (95% CI: 1.27-1.43). However, we detected a strong imbalance of covariates (in particular, age) between treatment groups. Propensity score matching with an enforced caliper was necessary to achieve cohort balance. After accounting for this imbalance and applying TMLE, the data revealed a significant protective effect of the vaccine against pneumonia, with a TMLE-adjusted OR of 0.78 (95% CI: 0.72-0.84).
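The reversal from a crude OR above 1 to an adjusted OR below 1 can be reproduced with a small worked example. The sketch below uses hypothetical 2x2 counts (not the study data) in which vaccinated patients are mostly older. It computes a crude OR with a Wald 95% CI and then a Mantel-Haenszel OR stratified by age. Mantel-Haenszel stratification is used here only as a simple stand-in for adjustment; it is not the propensity score matching or TMLE procedure reported in the abstract.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """OR and Wald CI for a 2x2 table.

    a, b = vaccinated cases / non-cases; c, d = unvaccinated cases / non-cases.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR across strata of (a, b, c, d) tables."""
    tables = list(strata)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Hypothetical counts (NOT study data); vaccination is far more common in the older stratum.
STRATA = {
    "younger": (8, 992, 90, 8910),
    "older": (320, 7680, 100, 1900),
}

# Collapse over age to get the crude table.
pooled = tuple(sum(cells) for cells in zip(*STRATA.values()))
crude_or, crude_lo, crude_hi = odds_ratio_wald(*pooled)
adjusted_or = mantel_haenszel_or(STRATA.values())

print(f"crude OR = {crude_or:.2f} (95% CI: {crude_lo:.2f}-{crude_hi:.2f})")
print(f"age-stratified Mantel-Haenszel OR = {adjusted_or:.2f}")
```

In these made-up counts the vaccine lowers risk within each age stratum, yet the crude comparison points the other way because age drives both vaccination and pneumonia risk.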
CONCLUSIONS: Real-world datasets and ML models can provide robust vaccine effectiveness estimates and give insights into the causal relationship between vaccination and disease when confounding pathways are properly accounted for. Further exploration of social determinants of health will continue to refine these estimates. Additionally, distinguishing hospital-acquired from community-acquired pneumonia and accounting for the type of pneumonia (bacterial vs. viral) will inform personal and policy decisions on vaccination.
Conference/Value in Health Info
Value in Health, Volume 26, Issue 11, S2 (December 2023)
Code
MSR151
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Confounding, Selection Bias Correction, Causal Inference
Disease
Respiratory-Related Disorders (Allergy, Asthma, Smoking, Other Respiratory), Vaccines