Self-reported health status is often measured using psychometric or utility indices that provide a score intended to summarize an individual's health. Measurements of health status can be subject to a ceiling effect, in which many respondents attain the maximum possible score, so that the measurement is effectively censored at that value. Frequently, researchers want to examine relationships between determinants of health and measures of health status. Regression methods that ignore this censoring can produce biased coefficient estimates. We examine the performance of three different models for assessing the relationship between demographic characteristics and health status.
Three methods for analyzing data subject to a ceiling effect are compared. The first model is the classic Tobit model. The second and third models are robust variants of the Tobit model: symmetrically trimmed least squares and censored least absolute deviations (Censored LAD) regression. These models were fit to data from the Canadian National Population Health Survey. The results are compared with those of three models that ignore the presence of a ceiling effect.
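To make the ceiling-effect problem concrete, the following is a minimal sketch of a Tobit model with right-censoring, fit by maximum likelihood and compared with ordinary least squares. It is not the authors' code and does not use the survey data: the simulated data, the rescaled age covariate, the ceiling of 1.0, the coefficient values, and the helper names `tobit_negloglik` and `fit_tobit` are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_negloglik(params, X, y, ceiling):
    """Negative log-likelihood of a Tobit model censored from above at `ceiling`."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # optimize log(sigma) so that sigma stays positive
    mu = X @ beta
    censored = y >= ceiling
    # Uncensored observations contribute the normal density of the observed score.
    ll_obs = norm.logpdf(y[~censored], loc=mu[~censored], scale=sigma)
    # Censored observations contribute P(latent score >= ceiling).
    ll_cens = norm.logsf(ceiling, loc=mu[censored], scale=sigma)
    return -(ll_obs.sum() + ll_cens.sum())


def fit_tobit(X, y, ceiling):
    """Fit the Tobit model, starting the optimizer from the OLS estimates."""
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta0
    start = np.append(beta0, np.log(resid.std()))
    res = minimize(tobit_negloglik, start, args=(X, y, ceiling), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])


# Simulated (hypothetical) data: latent health declines with age, and the
# observed score cannot exceed the ceiling.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(20, 80, n)
age_decades = (age - 50) / 10           # centered and rescaled for stable optimization
X = np.column_stack([np.ones(n), age_decades])
true_beta = np.array([0.95, -0.05])     # assumed intercept and slope per decade
latent = X @ true_beta + rng.normal(0, 0.1, n)
ceiling = 1.0
y = np.minimum(latent, ceiling)         # scores above the ceiling are recorded at it

ols_beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # ignores the ceiling
tobit_beta, tobit_sigma = fit_tobit(X, y, ceiling)

print("true slope :", true_beta[1])
print("OLS slope  :", ols_beta[1])
print("Tobit slope:", tobit_beta[1])
```

In this simulation the errors are normal and homoscedastic, which is exactly the setting the classic Tobit model assumes. The robust variants named above are intended for cases where those distributional assumptions are doubtful; they are not implemented in this sketch.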
The Censored LAD model produced coefficient estimates that tended to be shrunk toward 0 relative to those of the other two models. The three models produced conflicting evidence on the effect of gender on health status. Similarly, the rate of decay in health status with increasing age differed across the three models. The Censored LAD model produced results very similar to those of median regression. Furthermore, the Censored LAD model had the lowest prediction error in an independent validation dataset.
Our results highlight the need for careful consideration of how best to model variation in health status. On the basis of our study, we recommend the use of Censored LAD regression.