Q-Squared as a Mapping Model Performance Metric in Gastrointestinal Disease
Author(s)
Kristian Mallon, BSc, MPhil, MSc1, Richeal Maria Burns, MSc, PhD1, Rositsa Koleva-Kolarova, BSc, MSc, PhD2, Yaling Yang, PhD3, Helen Dakin, MSc, DPhil2.
1Atlantic Technological University, Sligo, Ireland, 2Health Economics Research Centre, University of Oxford, Oxford, United Kingdom, 3University of Oxford, Oxford, United Kingdom.
OBJECTIVES: Model performance in mapping studies is often assessed using metrics such as root mean square error (RMSE), mean squared error (MSE), and R-squared. Recently, the Q-squared statistic has been used as a test of prognostic accuracy; however, to our knowledge, it has not been used in any study mapping between quality-of-life measures. We aimed to estimate Q-squared for a subset of published mapping algorithms and to evaluate its usefulness and appropriateness as a metric of prediction accuracy and model performance.
METHODS: We used the HERC Database of Mapping Studies to review published studies mapping patient-reported outcome measures to EQ-5D and other target instruments in gastrointestinal disease (GID) populations. Data were extracted on study year, GID type, source and target instruments, RMSE, MSE, standard deviation (SD), mapping model, and model validation. We calculated Q-squared as either 1-(RMSE-squared/SD-squared) or 1-(MSE/SD-squared).
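The calculation described in the methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the input values are hypothetical, and it assumes that the reported SD is that of the observed target-instrument utility in the same sample as the error metric, and that the RMSE-based formula squares the RMSE before dividing (so that both routes agree, since MSE equals RMSE-squared).

```python
def q_squared_from_rmse(rmse: float, sd: float) -> float:
    """Q-squared from a reported RMSE and SD: 1 - RMSE^2 / SD^2."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return 1.0 - (rmse / sd) ** 2


def q_squared_from_mse(mse: float, sd: float) -> float:
    """Q-squared from a reported MSE and SD: 1 - MSE / SD^2."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return 1.0 - mse / sd ** 2


# Hypothetical inputs for illustration only (not taken from any included study).
example_rmse = 0.10
example_sd = 0.20

q2_a = q_squared_from_rmse(example_rmse, example_sd)
q2_b = q_squared_from_mse(example_rmse ** 2, example_sd)
print(round(q2_a, 3), round(q2_b, 3))
```

Under these assumptions the two functions return the same value up to floating-point rounding. A prediction error as large as the SD gives a Q-squared of 0, and errors larger than the SD give negative values.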
RESULTS: Among the included studies (n=9), none reported Q-squared. Seven studies reported the metrics needed to calculate Q-squared manually: six provided RMSE and SD, and one provided MSE and SD. Median Q-squared across these studies was 0.53 (interquartile range: 0.31). Q-squared was highest in studies applying ordinary least squares regression, with a median Q-squared of 0.68 (range: 0.21, 0.95), and lowest in studies applying censored least absolute deviations, with a median Q-squared of 0.33 (range: 0.11, 0.54). The highest Q-squared values were obtained for studies mapping the EORTC QLQ-C30 to SF-6D (Q-squared=0.95) and EQ-5D (Q-squared=0.81). Q-squared values differed by validation method, with a higher median in internal validation samples (median Q-squared=0.53; range: 0.37, 0.72) than in external validation samples (median Q-squared=0.28; range: 0.11, 0.95).
CONCLUSIONS: Incorporating Q-squared into future mapping studies could improve assessment of model performance, given its ability to assess predictive accuracy directly in a way that can be compared between samples and instruments.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR176
Topic
Economic Evaluation, Methodological & Statistical Research, Study Approaches
Disease
Gastrointestinal Disorders