Using Real-World Data and Machine Learning to Identify Patients at Highest Risk for Hospitalization Following Respiratory Syncytial Virus Infection
Author(s)
Zuzanna Drebert, PhD, Julia A. O'Rourke, PhD, E Susan Amirian, MSPH, PhD, Mike Temple, MD
TriNetX, LLC, Cambridge, MA, USA
OBJECTIVES: Respiratory syncytial virus (RSV) infection is a common cause of respiratory illness in adults and poses a greater risk to patients with chronic medical conditions than to healthy individuals. In 2023, the U.S. Food and Drug Administration (FDA) approved two RSV vaccines for older adults, and the Centers for Disease Control and Prevention (CDC) recommends vaccination for adults aged 75 years and older and for adults aged 60-74 years with chronic medical conditions. This work aimed to leverage machine learning (ML) to explore how variables derived from electronic health records (EHR) influence the risk of RSV-related hospitalization. We used explainable AI methods to understand how these clinical and healthcare utilization characteristics contribute to this risk.
METHODS: This study used a US-based research network, developed and maintained by TriNetX, comprising 66 healthcare organizations (HCOs) and 116 million patients. The cohort included 43,423 adults aged 59 years or older with an RSV infection diagnosis between January 1, 2019, and June 30, 2024. We trained a gradient boosting ML model (80%/20% training-to-test split) on 232 demographic, diagnosis, procedure, laboratory, and medication variables selected by the clinical research team. SHAP (SHapley Additive exPlanations) was used to explore model decisions and assign relative feature importance.
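The modeling step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the library, loss function, or hyperparameters, so this example assumes log-loss boosting with one-split decision stumps, written in plain Python, and runs on synthetic data. The three binary features, the simulated outcome, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Best single-feature threshold split (least squares) on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must leave rows on both sides
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def fit_gbm(X, y, n_rounds=30, lr=0.3):
    """Gradient boosting for binary outcomes: each stump fits the negative
    gradient of the log loss (observed label minus predicted probability)."""
    p = sum(y) / len(y)
    f0 = math.log(p / (1 - p))  # start from the log-odds of the base rate
    scores = [f0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        scores = [s + lr * (lm if row[j] <= t else rm) for row, s in zip(X, scores)]
    return f0, lr, stumps

def predict_proba(model, row):
    f0, lr, stumps = model
    score = f0 + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
    return sigmoid(score)

def mean_log_loss(probs, y):
    return -sum(math.log(p) if yi else math.log(1 - p) for p, yi in zip(probs, y)) / len(y)

# Synthetic stand-in data (assumption): three binary features, two of which
# drive a simulated hospitalization outcome. Not derived from the study.
rng = random.Random(0)
rows = []
for _ in range(500):
    x = [rng.randint(0, 1) for _ in range(3)]
    label = 1 if rng.random() < sigmoid(-1.0 + 2.0 * x[0] + 1.5 * x[1]) else 0
    rows.append((x, label))

# 80% / 20% training-to-test split, as in the abstract.
rng.shuffle(rows)
cut = int(0.8 * len(rows))
train, test = rows[:cut], rows[cut:]
X_train = [x for x, _ in train]
y_train = [label for _, label in train]

model = fit_gbm(X_train, y_train)
base_rate = sum(y_train) / len(y_train)
baseline_loss = mean_log_loss([base_rate] * len(y_train), y_train)
train_loss = mean_log_loss([predict_proba(model, x) for x in X_train], y_train)
test_probs = [predict_proba(model, x) for x, _ in test]
print(f"baseline log loss {baseline_loss:.3f}, boosted log loss {train_loss:.3f}")
```

In practice a study of this size would more plausibly rely on an established gradient boosting library; the hand-written version is used here only so that each step of the technique stays visible.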
RESULTS: Over a third (34.2%) of patients were admitted to the hospital within one day of the initial diagnosis. SHAP analysis indicated that resource utilization patterns were stronger predictors of RSV-related hospitalization than age and chronic conditions. Patients with a history of hospitalizations and those lacking outpatient visits were more likely to be hospitalized; these variables ranked 1st and 2nd by SHAP feature importance, with values of 0.306 and 0.281, respectively.
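To show how a SHAP-style feature ranking of this kind can arise, the sketch below computes exact Shapley values for a toy risk score and ranks features by mean absolute attribution. It is an illustration under stated assumptions, not the study's analysis: the feature names, weights, background rows, and patients are all hypothetical, and SHAP libraries typically use faster model-specific algorithms rather than this brute-force enumeration over feature subsets.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for one instance x.

    Features outside a coalition take their values from each background row,
    and the prediction is averaged over the background rows."""
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy risk score (illustrative weights only, not study results).
FEATURES = ["prior_hospitalization", "no_outpatient_visits", "age_75_plus"]
WEIGHTS = [2.0, 1.5, 0.5]

def toy_model(row):
    return sum(w * v for w, v in zip(WEIGHTS, row))

background = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]]
patients = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]

per_patient = [shapley_values(toy_model, p, background) for p in patients]

# Global importance: mean absolute Shapley value per feature across patients.
importance = [
    sum(abs(phi[i]) for phi in per_patient) / len(per_patient)
    for i in range(len(FEATURES))
]
ranking = sorted(range(len(FEATURES)), key=lambda i: importance[i], reverse=True)
for rank, i in enumerate(ranking, start=1):
    print(rank, FEATURES[i], round(importance[i], 3))
```

For each patient, the attributions sum to the difference between that patient's score and the average score over the background rows, which is what makes per-feature contributions comparable across patients.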
CONCLUSIONS: Patterns of healthcare utilization appear to be key predictors of RSV-related hospitalization. Future work will assess the relevance of these findings for RSV vaccination recommendations.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
MSR130
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Geriatrics, SDC: Infectious Disease (non-vaccine)