Machine Learning Insights Into COVID-19 Variant Spread Across US Regions

Author(s)

Hu L¹, Zhang X¹, Weimer I², Yapici HO¹, Shenoy A¹, Lodaya K¹, D'Souza F¹
¹Boston Strategic Partners, Inc., Boston, MA, USA; ²Boston Strategic Partners, Inc., Saint Paul, MN, USA

OBJECTIVES: This study uses machine learning, specifically random forest regression, to analyze how regional and temporal factors influence the percentage prevalence of key COVID-19 variants. It aims to identify predictors of variant prevalence and to support a more data-driven approach to pandemic management.

METHODS: The study integrated data from the National COVID Cohort Collaborative (N3C), the Bureau of Transportation Statistics, World Weather Online, the United States Environmental Protection Agency, and the US Census to build a comprehensive predictive model. Random forest regression was used to analyze how different factors affect the spread of COVID-19 variants such as Delta, Omicron BA.5, Alpha, and XBB.1.5 across US regions, and how sensitive each variant is to those factors.
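The modeling step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the feature names, the hold-out split, the hyperparameters, and the helper `fit_variant_model` are all assumptions, and the abstract does not state how the reported R² values were computed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def fit_variant_model(X, y, feature_names, seed=0):
    """Fit a random forest to variant share (%) and rank features by importance.

    Hypothetical helper: the split ratio and n_estimators are illustrative choices.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    # score() returns the coefficient of determination (R^2) on the held-out rows
    r2 = model.score(X_test, y_test)
    # Impurity-based importances, sorted from most to least influential
    ranked = sorted(
        zip(feature_names, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return model, r2, ranked


# Synthetic stand-in for the merged regional data (assumed column names).
rng = np.random.default_rng(0)
feature_names = ["ozone", "uv_index", "temperature", "median_income"]
X = rng.uniform(0.0, 1.0, size=(400, len(feature_names)))
# The synthetic target depends mostly on the first column, weakly on the third.
y = 60.0 * X[:, 0] + 5.0 * X[:, 2] + rng.normal(0.0, 1.0, size=400)

model, r2, ranked = fit_variant_model(X, y, feature_names)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
print(f"held-out R^2: {r2:.3f}")
```

Fitting one such model per variant and comparing the importance rankings is one way to contrast how sensitive each variant is to a given factor. Impurity-based importances can favor high-cardinality features, so permutation importance is a common cross-check.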

RESULTS: The model demonstrated high predictive accuracy, with R² values of 0.99 for Delta and BA.5, 0.98 for Alpha, and 0.97 for XBB.1.5, significantly surpassing the R² of 0.78 for a mixed-variant baseline. The spread of Delta correlated strongly with ozone density, BA.5 with sun hours and UV index, Alpha with temperature and air quality, and XBB.1.5 with land area and income. These results suggest a complex interplay between environmental factors and variant spread. The analysis also showed that each variant has its own favorable environment; for example, BA.5 is less sensitive than the other variants to UV index, and Delta is more sensitive to ozone density but less sensitive to temperature.
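For readers less familiar with the metric, R² (the coefficient of determination) is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch with made-up numbers follows; the function name `r_squared` and the sample values are illustrative and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model's predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared error of always predicting the mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative variant shares (%) and model predictions, invented for this example.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(round(r_squared(observed, predicted), 3))  # 0.964
```

An R² of 1 means the predictions match the observations exactly, while an R² of 0 means the model does no better than always predicting the mean.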

CONCLUSIONS: This study shows that machine learning is a useful tool for identifying the multifaceted contributors to the spread of COVID-19 variants. The results may inform targeted public health interventions and policies, highlighting the vital role of data-driven models during a pandemic.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

MSR6

Topic

Epidemiology & Public Health, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Public Health

Disease

Infectious Disease (non-vaccine), No Additional Disease & Conditions/Specialized Treatment Areas
