Utilizing Machine Learning to Predict Survival Outcomes in Multiple Myeloma Patients with Diabetes: Insights from a 20-Year Cohort Study
Author(s)
Junjie Huang1, Chenwen Zhong, PhD2, Martin Chi Sang Wong, MD3.
1Hong Kong, China, 2The Chinese University of Hong Kong, Hong Kong, China, 3The Chinese University of Hong Kong, Hong Kong, Hong Kong.
OBJECTIVES: Multiple myeloma is the third most common hematological malignancy worldwide and is closely linked with type 2 diabetes mellitus (T2DM). This study aimed to identify risk factors influencing the survival of multiple myeloma patients with T2DM, compare the performance of several predictive models, and develop a risk score for predicting patient survival probability.
METHODS: Data for this study were collected from the Hong Kong Hospital Authority Data Collaboration Laboratory (HADCL), encompassing a cohort of 986 multiple myeloma patients with T2DM between 2000 and 2020. To analyze the data, four algorithms were employed: Cox Proportional Hazards (CoxPH), CoxNet, Random Survival Forest (RSF), and Survival Trees. The performance of these models was evaluated using metrics such as the Area Under the Curve (AUC) and the concordance index (C-index). Additionally, Shapley Additive Explanations (SHAP) analysis was conducted to identify significant risk factors and to attribute model outputs. A risk score system was developed for risk stratification using the AutoScore-Survival package.
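As an illustration of the concordance index named above, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' code: the function name, the toy data, and the choice to skip pairs tied on follow-up time are assumptions made for this example, and the abstract does not state which C-index estimator was used.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is only comparable if the patient with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk, earlier event
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the reported C-index should be read.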
RESULTS: The analysis revealed several significant predictors for the survival of multiple myeloma patients with T2DM. These included the age at cancer diagnosis (HR=1.07, 95% CI [1.07, 1.09], p < .001), the duration of T2DM (HR=1.03, 95% CI [1.01, 1.06], p = .009), and High-Density Lipoprotein Cholesterol (HDL_c) levels (HR=0.37, 95% CI [0.16, 0.86], p = .022). Among the models assessed, the RSF demonstrated the best predictive performance, achieving an AUC of 0.808 and a C-index of 0.90.
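To help read the hazard ratios above: under a proportional hazards model, a hazard ratio reported per one-unit increase compounds multiplicatively over larger differences. The worked example below assumes that age at diagnosis and T2DM duration were modeled per year; the abstract does not state the units, so this is an illustration rather than a reported result.

\[
\mathrm{HR}(\Delta x) = \exp(\beta\,\Delta x) = \mathrm{HR}_{\text{per unit}}^{\,\Delta x},
\qquad
1.07^{10} \approx 1.97,
\qquad
1.03^{10} \approx 1.34 .
\]

Under that assumption, a patient diagnosed 10 years older would have roughly twice the hazard of death, and 10 additional years of T2DM would correspond to about a 34% higher hazard. The HDL_c hazard ratio of 0.37 corresponds to a 63% lower hazard per one-unit increase in HDL_c, in whatever unit the study used.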
CONCLUSIONS: This study successfully identified three significant risk factors associated with the prognosis of multiple myeloma patients with T2DM. A comprehensive range of algorithms was utilized, with the RSF model showing the most robust predictive capability. Furthermore, a validated risk score system was developed, enabling effective patient stratification based on individual risk profiles.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
CO145
Topic
Clinical Outcomes
Topic Subcategory
Relating Intermediate to Long-term Outcomes
Disease
SDC: Diabetes/Endocrine/Metabolic Disorders (including obesity), SDC: Musculoskeletal Disorders (Arthritis, Bone Disorders, Osteoporosis, Other Musculoskeletal), SDC: Oncology