Comparing Methodologies to Predict Incidence of COVID-19 in US Counties

Author(s)

Coplan P1, Shah S2, Bhardwaj A3, Gurubaran A3, Dwarakanathan H3, Cafri G4, Chitnis A1, Khanna R5, Kakade O3, Nandi B3, Holy C6
1Johnson & Johnson, New Brunswick, NJ, USA, 2Mu Sigma, Bangalore, KA, India, 3Mu Sigma, Bengaluru, KA, India, 4Johnson & Johnson, San Diego, CA, USA, 5Johnson & Johnson Co., New Brunswick, NJ, USA, 6Johnson & Johnson, Somerville, MA, USA

OBJECTIVES:

With the spread of the SARS-CoV-2 virus worldwide, governments have adopted stringent measures to prevent disease spread. As lockdowns are being eased, models to evaluate potential resurgence of disease are increasingly important. The aim of this study is to compare methodologies to predict incidence of COVID-19 for US counties.

METHODS:

Reported numbers of COVID-19-positive cases were obtained from the CDC, social distancing scores (SDS) from Unacast, population density from US Census data, and testing rates from the CDC website. The data covered the period from February 28, 2020 to May 28, 2020. Poisson and linear regression models were built to predict the number of reported cases using 1-week lagged SDS, tests per day, and population density. Damped Holt linear trend (DHLT) coefficients and moving averages were calculated from the daily number of cases in the latest 14 days. All models were built at the county level. The following 4 methodologies were compared: Poisson regression, linear regression, DHLT, and simple moving average (SMA). Data from the month of June were used to validate the results.
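As an illustration of the two time-series methods named above, the following is a minimal sketch of an SMA and a damped Holt linear trend forecast computed from a 14-day window of daily case counts for one county. It is not the authors' implementation: the function names, the smoothing parameters (alpha, beta, phi), and the initialization of level and trend are assumptions chosen for illustration, since the abstract does not report them.

```python
def simple_moving_average(daily_cases, window=14):
    """Mean of the most recent `window` daily counts (flat forecast)."""
    recent = daily_cases[-window:]
    return sum(recent) / len(recent)


def damped_holt_forecast(daily_cases, horizon, alpha=0.5, beta=0.3, phi=0.9):
    """Damped Holt linear trend forecast for `horizon` days ahead.

    alpha, beta, phi are assumed values, not taken from the study.
    Level and trend are initialized from the first two observations.
    """
    level = daily_cases[0]
    trend = daily_cases[1] - daily_cases[0]
    for y in daily_cases[1:]:
        prev_level = level
        # Level: blend the new observation with the damped one-step forecast.
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)
        # Trend: blend the latest change in level with the damped old trend.
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend

    forecasts = []
    damping_sum = 0.0
    for h in range(1, horizon + 1):
        damping_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damping_sum * trend)
    return forecasts


# Hypothetical 14-day window of daily reported cases for one county.
window_cases = [12, 15, 14, 18, 21, 19, 24, 26, 25, 30, 33, 31, 36, 40]
sma_value = simple_moving_average(window_cases)
dhlt_values = damped_holt_forecast(window_cases, horizon=7)
```

With phi below 1 the projected trend flattens as the horizon grows, which is the property that distinguishes the damped variant from an undamped Holt forecast.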

RESULTS:

US counties were ranked by annualized incidence of disease from highest to lowest, and the top 100 counties were identified for each methodology. Counties predicted to be in the top 100 were compared with those that were actually in the top 100 according to reported counts. The Poisson and linear regressions each correctly identified 45 of the top 100 counties, whereas SMA and DHLT identified only 36 and 29 counties, respectively.
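The top-100 comparison described above can be sketched as a set overlap between predicted and observed rankings. This is a hedged illustration only: the function names are hypothetical, and the annualization formula (cases per 100,000 population scaled to 365 days) is an assumed definition, as the abstract does not state how annualized incidence was calculated.

```python
def annualized_incidence(cases, population, days, per=100_000):
    """Assumed definition: cases per `per` population, scaled to one year."""
    return cases / population * (365 / days) * per


def top_k(incidence_by_county, k=100):
    """Set of the k counties with the highest incidence."""
    ranked = sorted(incidence_by_county, key=incidence_by_county.get, reverse=True)
    return set(ranked[:k])


def top_k_hits(predicted, observed, k=100):
    """Number of counties in both the predicted and the observed top k."""
    return len(top_k(predicted, k) & top_k(observed, k))


# Hypothetical toy data: four counties, comparing the top 2.
predicted = {"A": 50.0, "B": 40.0, "C": 30.0, "D": 20.0}
observed = {"A": 45.0, "B": 10.0, "C": 35.0, "D": 25.0}
hits = top_k_hits(predicted, observed, k=2)  # only county "A" is in both top-2 sets
```

Under this scheme, the figures reported in the abstract correspond to the hit count for each methodology with k = 100.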

CONCLUSIONS:

Linear regression and Poisson regression were the most accurate in predicting high incidence. Potential confounding factors, such as mask usage or changes in behavior, were not included in our study. Further research on these factors is needed to improve prediction accuracy.

Conference/Value in Health Info

2020-11, ISPOR Europe 2020, Milan, Italy

Value in Health, Volume 23, Issue S2 (December 2020)

Code

PIN87

Topic

Epidemiology & Public Health

Topic Subcategory

Public Health

Disease

Infectious Disease (non-vaccine)
