A Comparison of Methods for Missing Covariates in a Meta-Regression Using Data From a Systematic Review on Oral Epithelial Dysplasia
Author(s)
Yeonggyeong Jang, BS1, Areum Han, BS1, Hee-Kyung Park, DDS, PhD2, Seokyung Hahn, PhD3.
1Interdisciplinary Program of Medical Informatics, Seoul National University, Seoul, Korea, Republic of, 2Department of Oral Medicine and Oral Diagnosis, School of Dentistry and Dental Research Institute, Seoul National University, Seoul, Korea, Republic of, 3Seoul National University College of Medicine, Seoul, Korea, Republic of.
OBJECTIVES: Missing data is a common issue in meta-analyses. The problem can be more critical in meta-regression, which usually involves several covariates and therefore omits more studies because of missing covariate values. The aim of this study was to explore methods for handling missing covariates.
METHODS: We used data from a systematic review that included 54 studies and 10 study-level covariates. First, we plotted the missing-data pattern to understand how missing values were distributed across the covariates. We then constructed a conventional multiple random-effects meta-regression model through manual backward elimination. We employed three methods to analyze the full dataset: a Bayesian random-effects model; multiple imputation using MICE (multivariate imputation by chained equations); and a full information maximum likelihood (FIML) model. We compared the results in terms of regression coefficients and p-values.
RESULTS: The conventional complete-case analysis included only 11 (20%) of the 54 studies. The model included severity, dysplasia site, smoking status and follow-up as covariates. The coefficients of all the covariates indicated a positive relationship with the malignant transformation rate of oral epithelial dysplasia, and their p-values were all less than 0.05. The coefficients estimated by the Bayesian and FIML models were very similar to those from the original analysis; however, two covariates were no longer statistically significant. When MICE was employed, the size of all the coefficients decreased by up to half, and one covariate lost its statistical significance. Smoking status lost statistical significance under all three methods; it also had missing values in more than half of the studies in the dataset.
CONCLUSIONS: We demonstrated in practice that a multiple meta-regression can be based on a very small proportion of the data because of missing covariates, including clinically important ones, and that the results can differ depending on how these missing covariates are handled.
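The attrition caused by a complete-case analysis can be illustrated with a minimal sketch. The study records, covariate values, and function names below are hypothetical and are not taken from the systematic review; the sketch only shows how tabulating missingness patterns reveals how many studies survive once every covariate must be observed.

```python
from collections import Counter

# Hypothetical toy data (NOT the review's data): one dict per study,
# with None marking a covariate the study did not report.
STUDIES = [
    {"severity": 1.0, "site": 0.4, "smoking": None, "follow_up": 5.0},
    {"severity": 0.0, "site": None, "smoking": None, "follow_up": 3.2},
    {"severity": 1.0, "site": 0.7, "smoking": 0.6, "follow_up": 7.1},
    {"severity": None, "site": 0.2, "smoking": None, "follow_up": 4.0},
    {"severity": 0.0, "site": 0.5, "smoking": 0.3, "follow_up": 6.5},
]
COVARIATES = ["severity", "site", "smoking", "follow_up"]


def missing_pattern(studies, covariates):
    """Count studies per missingness pattern ('X' = missing, '.' = observed)."""
    return Counter(
        "".join("X" if study[c] is None else "." for c in covariates)
        for study in studies
    )


def missing_fraction(studies, covariates):
    """Fraction of studies in which each covariate is missing."""
    n = len(studies)
    return {c: sum(study[c] is None for study in studies) / n for c in covariates}


def complete_cases(studies, covariates):
    """Keep only studies in which every listed covariate is observed."""
    return [s for s in studies if all(s[c] is not None for c in covariates)]


if __name__ == "__main__":
    print("Patterns:", dict(missing_pattern(STUDIES, COVARIATES)))
    print("Missing fraction:", missing_fraction(STUDIES, COVARIATES))
    kept = complete_cases(STUDIES, COVARIATES)
    print(f"Complete cases: {len(kept)} of {len(STUDIES)}")
```

In the abstract's own data the same filter left 11 of 54 studies, which is about 20%.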
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR1
Topic
Clinical Outcomes, Methodological & Statistical Research, Real World Data & Information Systems
Topic Subcategory
Missing Data
Disease
Oncology