Early Detection of Cancer Therapy-Related Cardiac Dysfunction in Lung Cancer Patients Using Machine Learning
Author(s)
Wen Li Kuan, II, MD1, Hsiu-Ting Chien, PhD1, Chih-Fan Yeh, PhD2, Sandy Hsu, MD3, Wan Tseng Hsu, PhD1, Fang-Ju (Irene) Lin, RPh, PhD3.
1Graduate Institute of Clinical Pharmacy, College of Medicine, National Taiwan University, Taipei, Taiwan, 2Division of Cardiology, Department of Internal Medicine and Cardiovascular Center, National Taiwan University Hospital, Taipei, Taiwan, 3School of Pharmacy, College of Medicine, National Taiwan University, Taipei, Taiwan.
OBJECTIVES: This study aimed to develop a machine learning model for early detection of cancer therapy-related cardiac dysfunction (CTRCD) in patients with lung cancer (LC), integrating electrocardiography (ECG) to enhance detection accuracy.
METHODS: Data were extracted from the electronic health records of LC patients receiving their first cancer therapy at a tertiary hospital between 2009 and 2022. CTRCD was defined as new-onset heart failure (HF) following initial LC treatment, determined by a decline in left ventricular ejection fraction or an HF diagnosis during follow-up. All outcomes were adjudicated by a cardiologist. Clinical features were collected up to 30 days before CTRCD onset to build the detection model. Four machine learning (ML) algorithms (Lasso regression, Random Forest, XGBoost, and Naïve Bayes) were applied. Model performance was evaluated using area under the precision-recall curve (AUPRC), accuracy, and recall. SHapley Additive exPlanations (SHAP) analysis was performed to identify key predictors of CTRCD.
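As an illustration of the class-imbalance handling and evaluation metrics named in the methods, the following is a minimal Python sketch and not the authors' code. It assumes plain random undersampling of the majority class (the abstract does not state which undersampling variant was used) and approximates AUPRC by average precision. The function names `undersample`, `recall`, and `average_precision`, and the toy data, are hypothetical.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


def recall(y_true, y_pred):
    """Fraction of true positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Assumption: tied scores are ranked in input order; a production
    implementation would group ties at the same threshold.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / n_pos if n_pos else 0.0


# Toy cohort with the class counts reported in the abstract (52 vs 341).
toy_labels = [1] * 52 + [0] * 341
toy_rows = list(range(len(toy_labels)))  # placeholder feature rows
balanced_rows, balanced_labels = undersample(toy_rows, toy_labels, seed=42)

# Tiny hand-checkable scoring example (hypothetical scores).
demo_true = [1, 0, 1, 0]
demo_scores = [0.9, 0.8, 0.7, 0.1]
demo_ap = average_precision(demo_true, demo_scores)
demo_recall = recall(demo_true, [1 if s >= 0.5 else 0 for s in demo_scores])
```

In a sketch like this, undersampling would be applied to the training split only, so that the evaluation metrics still reflect the original class balance; whether the study did so is not stated in the abstract.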
RESULTS: A total of 52 patients who developed CTRCD and 341 who did not were included. The Random Forest model with undersampling for class imbalance performed best (AUPRC = 0.652, recall = 1.00), outperforming the model without ECG data (AUPRC = 0.618, recall = 0.90). SHAP analysis identified key features, including lower hemoglobin, higher blood urea nitrogen, and elevated heart rate. ECG features such as mean QRS axis, QTc interval, and supraventricular arrhythmias ranked among the top 20 predictors, highlighting the potential of ECG in early CTRCD detection.
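To put the reported AUPRC values in context, it can help to compare them with the outcome prevalence, since a classifier that ranks patients at random has an expected AUPRC roughly equal to the proportion of positive cases. The short Python calculation below is an illustrative sketch under a stated assumption: that the evaluation set had the same class balance as the full cohort (52 of 393), which the abstract does not confirm. The variable names are hypothetical.

```python
# Cohort counts and AUPRC values as reported in the abstract.
n_ctrcd = 52
n_no_ctrcd = 341
auprc_with_ecg = 0.652
auprc_without_ecg = 0.618

# Assumption: evaluation-set prevalence equals whole-cohort prevalence.
prevalence = n_ctrcd / (n_ctrcd + n_no_ctrcd)

# Ratio of each model's AUPRC to the approximate chance-level AUPRC.
lift_with_ecg = auprc_with_ecg / prevalence
lift_without_ecg = auprc_without_ecg / prevalence

print(f"prevalence (approx. chance-level AUPRC): {prevalence:.3f}")
print(f"AUPRC / prevalence, with ECG:    {lift_with_ecg:.2f}")
print(f"AUPRC / prevalence, without ECG: {lift_without_ecg:.2f}")
```

If the test set was itself rebalanced, the chance-level reference would be higher and these ratios correspondingly smaller.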
CONCLUSIONS: The ML model effectively identified CTRCD early by integrating clinical and ECG data. Incorporating ECG, a non-invasive and cost-effective tool, into clinical practice may enhance the feasibility and scalability of early CTRCD detection, potentially improving patient outcomes through timely interventions. External validation is required to confirm the model's generalizability.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR79
Topic
Epidemiology & Public Health, Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
Cardiovascular Disorders (including MI, Stroke, Circulatory), Oncology