Can Machine Learning Assess the Internal Predictive Validity of Decision Analytic Models? Quantifying the Predictive Power of Input Parameters on Output Estimates
Speaker(s)
ABSTRACT WITHDRAWN
OBJECTIVES: The structure of decision-analytic models implicitly assumes that inputs (e.g., utilities and state costs) are correlated with outputs (e.g., quality-adjusted life years and costs). However, some inputs will have a greater impact on outputs than others. For complex models, it can be difficult to determine which input parameters have the most predictive power. Machine learning (ML) offers a computationally efficient, data-driven approach to critique the predictive power of model inputs on output values. ML-based models are versatile tools: they can reduce computational time when complex models are evaluated with new data, and they can also be used as a model validation/diagnostic tool.
METHODS: We propose a systematic ML-based approach that progresses from simple (regression) to more complex techniques, specifically: linear regression; LASSO; Gaussian Process Regression; Deep Gaussian Process Regression; and Random Forests. We applied these approaches to the Sheffield Type 2 Diabetes Treatment Model version 3, a complex patient-level simulation with 390 input parameters. A total of 2,000 probabilistic sensitivity analysis (PSA) iterations were available, from which a decision on mean cost-effectiveness was conclusive. Incremental net monetary benefit (IMB) was analyzed alongside all inputs. The dataset was split into a training dataset to estimate the model(s) and a test dataset to assess predictive power using graphical and statistical (mean absolute error and root mean square error) techniques.
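The split-and-score step described in METHODS can be illustrated with a minimal sketch. It uses the simplest technique in the list (linear regression) on synthetic data; the 20 inputs, the coefficients, the noise level, and the 80/20 split ratio are all assumptions made for illustration and are not taken from the diabetes model or its PSA output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for PSA output: rows are iterations, columns are inputs.
n_iter, n_inputs = 2000, 20
X = rng.normal(size=(n_iter, n_inputs))
true_beta = np.zeros(n_inputs)
true_beta[:3] = [5.0, -3.0, 2.0]  # assumed: only three inputs drive the output
y = X @ true_beta + rng.normal(scale=1.0, size=n_iter)

# Random 80/20 split: the training rows estimate the model,
# the held-out test rows assess predictive power.
idx = rng.permutation(n_iter)
n_train = int(0.8 * n_iter)
train, test = idx[:n_train], idx[n_train:]

def add_intercept(A):
    """Prepend a column of ones so the fit includes an intercept."""
    return np.column_stack([np.ones(len(A)), A])

# Ordinary least squares fitted on the training rows only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train]), y[train], rcond=None)

# Out-of-sample error on the test rows.
resid = y[test] - add_intercept(X[test]) @ coef
mae = float(np.mean(np.abs(resid)))
rmse = float(np.sqrt(np.mean(resid ** 2)))
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```

Because both errors are computed on rows the model never saw, they estimate how well the fitted emulator would predict new PSA iterations rather than how well it reproduces the data it was fitted to.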
RESULTS: When the ML techniques were initially applied, the model diagnostics indicated that they had poor predictive power; after re-running the model with 10,000 PSA iterations, the model fits improved substantially. The LASSO estimates also showed which input parameters matter most for IMB, creating the possibility for further model refinement.
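How LASSO estimates can flag the inputs that matter most is sketched below. This is not the authors' implementation: it is a small hand-written coordinate-descent LASSO on synthetic data, and the penalty `alpha`, the number of inputs, and the true coefficients are assumptions chosen for illustration. In practice an established library routine would normally be used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical PSA-like data: 20 inputs, of which only the first three matter.
n_iter, n_inputs = 2000, 20
X = rng.normal(size=(n_iter, n_inputs))
true_beta = np.zeros(n_inputs)
true_beta[:3] = [5.0, -3.0, 2.0]
y = X @ true_beta + rng.normal(size=n_iter)

# Standardize inputs and center the output so coefficients are comparable.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_cd(X, y, alpha, n_sweeps=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                     # equals y - X @ beta while beta is zero
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]   # remove input j's current contribution
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
            resid -= X[:, j] * beta[j]   # add the updated contribution back
    return beta

beta = lasso_cd(Xs, yc, alpha=0.2)

# Rank inputs by absolute coefficient; inputs shrunk exactly to zero are dropped.
ranked = np.argsort(-np.abs(beta))
selected = [int(j) for j in ranked if beta[j] != 0.0]
print("Inputs retained by LASSO, most influential first:", selected)
```

The L1 penalty sets the coefficients of weakly predictive inputs exactly to zero, so the surviving coefficients give a short, ordered list of candidate inputs for closer inspection or model refinement.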
CONCLUSIONS: The number of PSA iterations had to be increased to produce an ML-based model that could better identify small output differences, inform the removal or modification of input states/values, and help check for model programming errors.
Code
HTA393
Topic
Health Technology Assessment, Study Approaches
Topic Subcategory
Decision & Deliberative Processes, Decision Modeling & Simulation
Disease
No Additional Disease & Conditions/Specialized Treatment Areas