MACHINE LEARNING-BASED PREDICTION OF 10-YEAR CORONARY HEART DISEASE RISK AND COMPARISON WITH THE FRAMINGHAM RISK SCORE

Author(s)

Haeseon Lee, MSc, PharmD, Xiangyang Ye, MA, MS, PhD, Kwame Kissi-Twum, MSc, Nathorn Chaiyakunapruk, PharmD, PhD
University of Utah, Salt Lake City, UT, USA

OBJECTIVES: The office-based Framingham Risk Score (FRS) is widely used to estimate 10-year coronary heart disease (CHD) risk because of its simplicity and reliance on a limited set of variables. However, this parsimony may limit its ability to leverage additional clinical information routinely collected in practice. It remains uncertain whether modest extensions of the FRS predictor set with such readily available variables, when modeled with machine learning (ML) algorithms, translate into meaningful improvements in practical risk prediction. We aimed to compare ML models with the FRS for 10-year CHD risk prediction.
METHODS: A total of 4,240 participants were included, of whom 644 (15.2%) experienced CHD within 10 years. Baseline variables comprised the traditional FRS predictors plus additional measures: heart rate, glucose, smoking exposure, and education. Six classification models were explored: logistic regression, random forest (RF), support vector machine (SVM), XGBoost, LightGBM, and CatBoost. Class imbalance was addressed by downsampling during training. Models were assessed using discrimination metrics and classification performance at a clinically relevant 20% risk threshold, with secondary analyses using F1-score-optimized thresholds.
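The downsampling step can be illustrated with a minimal sketch. The abstract does not state the target class ratio or the tooling used, so the 1:1 ratio, the function name `downsample_majority`, and the toy data below are assumptions for illustration only, not the authors' implementation.

```python
import random

def downsample_majority(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are equal in size.

    Assumption: a 1:1 target ratio; the abstract does not specify one.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is kept in full.
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]

# Hypothetical training split with roughly the cohort's 15% event rate.
train_rows = list(range(100))
train_labels = [1] * 15 + [0] * 85
balanced_rows, balanced_labels = downsample_majority(train_rows, train_labels)
print(len(balanced_labels), sum(balanced_labels))
```

Downsampling of this kind would be applied to the training data only, so that evaluation data keep the original event rate.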
RESULTS: Across ML models, ROC-AUC ranged from 0.64 to 0.67, compared with 0.70 (95% CI 0.65-0.75) for the FRS. At the 20% threshold, RF achieved higher sensitivity than the FRS for identifying high-risk individuals, but with lower specificity and precision. The FRS achieved higher accuracy and the highest F1-score, reflecting a more balanced profile. When thresholds were optimized to maximize F1-score, performance differences were attenuated, but the resulting thresholds varied and lacked clinical interpretability. Shapley Additive Explanations (SHAP) analysis of the RF model identified age, systolic blood pressure, hypertension, total cholesterol, and sex as the most influential predictors.
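The threshold-based metrics reported here can be sketched as follows. The toy labels and predicted risks are invented for illustration, the helper names are hypothetical, and treating a risk exactly equal to the threshold as high-risk is an assumption; none of this reproduces the study's data or code.

```python
def metrics_at_threshold(y_true, risk, threshold=0.20):
    """Classify as high-risk when predicted risk >= threshold, then score."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, risk):
        high_risk = p >= threshold
        if high_risk and y == 1:
            tp += 1
        elif high_risk:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }

def best_f1_threshold(y_true, risk):
    """Return the observed risk value that maximizes F1 when used as a cutoff."""
    return max(sorted(set(risk)),
               key=lambda t: metrics_at_threshold(y_true, risk, t)["f1"])

# Invented example: 8 participants, 3 with CHD events.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
risk = [0.05, 0.10, 0.15, 0.25, 0.18, 0.30, 0.40, 0.60]

fixed = metrics_at_threshold(y_true, risk, threshold=0.20)
f1_cutoff = best_f1_threshold(y_true, risk)
print(fixed, f1_cutoff)
```

In this toy example the F1-maximizing cutoff differs from the fixed 20% threshold, which is the kind of data-dependent value that is harder to interpret clinically than a pre-specified risk cutoff.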
CONCLUSIONS: Modest extensions of traditional cardiovascular risk factors may be insufficient to achieve clinically superior risk prediction, even with advanced ML approaches. The sustained performance of the FRS underscores the robustness of established risk frameworks and suggests that future improvements will require fundamentally novel predictors.

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

MSR1

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Cardiovascular Disorders (including MI, Stroke, Circulatory)

Presentation (CTI)