Predicting Survival in Multiple Myeloma Patients With Diabetes Using Machine Learning: A 20-Year Cohort Study
Author(s)
Junjie Huang, PhD1, Chenwen Zhong, PhD2, Martin Chi Sang Wong, MD1.
1The Chinese University of Hong Kong, Hong Kong, Hong Kong, 2The Chinese University of Hong Kong, Hong Kong, China.
OBJECTIVES: Multiple myeloma is the third most common hematological malignancy worldwide and has a complex mutual association with type 2 diabetes mellitus (T2DM). However, studies exploring prognostic risk factors in multiple myeloma patients with T2DM are lacking, and no study has developed a risk score specifically for the prognosis of these patients. This study therefore has three main objectives: 1) to identify risk factors affecting the survival of multiple myeloma patients with T2DM; 2) to compare the performance of different models; and 3) to create a risk score that predicts patients' survival probability.
METHODS: We collected data from the Hong Kong Hospital Authority Data Collaboration Laboratory (HADCL) on 986 multiple myeloma patients with T2DM between 2000 and 2020. Four algorithms were used: Cox proportional hazards (CoxPH), CoxNet, random survival forest (RSF), and survival tree. Area under the curve (AUC) and concordance index (C-index) were used to compare model performance. SHAP (SHapley Additive exPlanations) analysis was performed to identify risk factors and attribute model output. A risk score system was developed with the AutoScore-Survival package for risk stratification.
RESULTS: Significant predictors of survival among multiple myeloma patients with T2DM included age at cancer diagnosis (HR=1.07, 95% CI [1.07, 1.09], p<.001), T2DM period (HR=1.03, 95% CI [1.01, 1.06], p=.009), and HDL cholesterol (HDL_c; HR=0.37, 95% CI [0.16, 0.86], p=.022). The RSF model had the best predictive performance (AUC 0.808, C-index 0.90). In the test set, the tuned risk score system had AUCs of 0.776, 0.838, and 0.885 at 1, 3, and 5 years, respectively, and a C-index of 0.725.
CONCLUSIONS: This study identified three significant risk factors associated with the prognosis of multiple myeloma patients with T2DM. Of the four algorithms compared, the RSF model had the best predictive power. A validated risk score system was created to stratify patients by risk.
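ILLUSTRATION: The C-index used above to compare models can be sketched in a few lines. The following is a minimal, pure-Python version of Harrell's concordance index, not the authors' code; the function name, argument names, and the simple handling of tied times (skipped) are assumptions made for illustration. It counts, among comparable patient pairs, how often the patient with the higher predicted risk is the one who died earlier.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped for simplicity
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index values reported above are read. In practice a library implementation would normally be preferred over this sketch.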
Conference/Value in Health Info
2025-09, ISPOR Real-World Evidence Summit 2025, Tokyo, Japan
Value in Health Regional, Volume 49S (September 2025)
Code
RWD39
Topic Subcategory
Distributed Data & Research Networks
Disease
SDC: Oncology