COMPARING OUTCOME MODELING STRATEGIES FOR ZERO-INFLATED ABSENTEEISM IN REAL-WORLD DATA: IMPLICATIONS FOR INFERENCE ON ANXIETY SEVERITY
Author(s)
Adam K. Jauregui, M.S. in Biostatistics
Oracle Life Sciences, Austin, TX, USA
OBJECTIVES: Work absenteeism is a bounded, often skewed, and zero-inflated outcome that is commonly modeled with negative binomial (NB) regression in real-world evidence (RWE) studies. This study evaluated whether ordered beta (OB) regression, a Bayesian technique that models outcomes on the [0, 1] scale, yields substantively smaller errors than NB when modeling the association between anxiety severity and absenteeism.
METHODS: Cross-sectional data from the 2023 US National Health and Wellness Survey, an online survey of the general US adult population, were used to identify employed respondents with a physician-diagnosed anxiety condition. Symptom severity was categorized using Generalized Anxiety Disorder-7 (GAD-7) scores (0-4=Minimal, 5-9=Mild, 10-14=Moderate, 15-21=Severe). Absenteeism was captured as the percentage of work hours missed in the past 7 days. Three regression models were estimated (Gaussian, NB, and OB), with absenteeism rescaled to [0, 1] for the OB model. All models were adjusted for Patient Health Questionnaire-9 scores, Charlson Comorbidity Index, age, sex, education, and income. Model comparisons focused on average marginal effects (AMEs), calibration plots, and root mean square error (RMSE).
RESULTS: Analyses included 9,202 respondents (GAD-7 severity: minimal=28%, mild=35%, moderate=22%, severe=15%). Zero absenteeism was reported by 63% of respondents. Across all models, higher GAD-7 levels were associated with increased absenteeism. AME values were largely consistent across models; however, the Gaussian model AME for “Mild vs. Minimal” was smaller (0.63) than the AMEs from the OB (2.58) and NB (2.06) models. RMSE values were identical across models for less severe anxiety but higher for NB at the “Moderate” (27.7) and “Severe” (27.4) levels when compared with the Gaussian (22.7; 25.8) and OB (22.8; 26.0) models. Calibration plots also revealed overprediction by the NB model at higher GAD-7 levels.
CONCLUSIONS: OB regression demonstrated competitive AMEs, calibration, and prediction errors, supporting its suitability for modeling zero-inflated outcomes on the [0, 1] scale in RWE studies.
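ILLUSTRATION: The data preparation and error summary described in METHODS can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's actual code: the function names (`gad7_severity`, `rescale_percent`, `rmse_by_group`) are hypothetical, only the GAD-7 cut points and the [0, 1] rescaling come from the abstract, and the model fitting itself (Gaussian, NB, OB) is not shown.

```python
import math
from collections import defaultdict


def gad7_severity(score: int) -> str:
    """Map a GAD-7 total score (0-21) to the severity bands used in the abstract."""
    if not 0 <= score <= 21:
        raise ValueError("GAD-7 total score must be between 0 and 21")
    if score <= 4:
        return "Minimal"
    if score <= 9:
        return "Mild"
    if score <= 14:
        return "Moderate"
    return "Severe"


def rescale_percent(pct: float) -> float:
    """Rescale an absenteeism percentage (0-100) to the [0, 1] interval."""
    if not 0 <= pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return pct / 100.0


def rmse_by_group(groups, observed, predicted):
    """Root mean square error of predictions, computed separately per group."""
    squared_errors = defaultdict(list)
    for group, y, y_hat in zip(groups, observed, predicted):
        squared_errors[group].append((y - y_hat) ** 2)
    return {g: math.sqrt(sum(v) / len(v)) for g, v in squared_errors.items()}
```

Under these assumptions, a per-severity RMSE table would come from calling `rmse_by_group` with the severity label of each respondent, the observed absenteeism percentage, and a fitted model's predictions on the same percentage scale, once per model.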
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR206
Topic
Methodological & Statistical Research
Topic Subcategory
PRO & Related Methods