Application of Machine Learning to Predict Patient Engagement in Shared Decision Making in Cancer Care
Author(s)
Pavan Kumar Narapaka, PhD1, Goudicherla Manasa, M Pharmacy1, Manisha Singh, MBBS, DNB, ESMO2, Sameer Dhingra, PhD1.
1Department of Pharmacy Practice, National Institute of Pharmaceutical Education and Research, Hajipur, India, 2Medical Oncology, Mahavir Cancer Sansthan and Research Centre, Patna, India.
OBJECTIVES: To develop a machine learning (ML)-based prediction model that estimates the likelihood of patient involvement in shared decision-making (SDM) from clinical and demographic variables in oncology care settings.
METHODS: A cross-sectional study was conducted at a tertiary care hospital in India, involving 384 cancer inpatients. The SDM-Q-9 questionnaire was used to evaluate SDM participation. Supervised machine learning models, including Logistic Regression, Random Forest, Support Vector Machine (SVM), and XGBoost, were trained in R 4.4.2 following preprocessing (e.g., one-hot encoding, outlier identification). The models were assessed using accuracy, precision, recall, F1-score, and AUC. Model interpretability and feature importance were measured using SHAP values.
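As an illustration of the evaluation metrics named above, the following is a minimal sketch and not the authors' R code. The function names, confusion-matrix counts, labels, and scores are hypothetical and chosen only to show how accuracy, precision, recall, and a rank-based AUC can be computed for a binary outcome such as SDM participation.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are true
        "recall": tp / (tp + fn),        # share of actual positives that are found
    }


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only (not study data).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(metrics, example_auc)
```

The pairwise AUC shown here is quadratic in the sample size, which is acceptable for a few hundred patients; sorting-based implementations are usually preferred for larger datasets.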
RESULTS: The Random Forest model showed the best predictive performance, with an AUC of 0.962, an F1-score of 0.917, precision of 0.943, recall of 89.2%, and accuracy of 90.5%. XGBoost also performed well, with an accuracy of 93.7% and an AUC of 0.935. Key predictors included age, cancer type, education level, and gender. SHAP analysis confirmed the importance of these features in model decision-making. There were no concerns regarding missing data or class imbalance in the dataset.
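The reported F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below uses only the Random Forest values stated in the abstract; the variable names are illustrative.

```python
# Random Forest values as reported in the abstract.
precision = 0.943
recall = 0.892  # reported as 89.2%

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 3))  # 0.917, matching the reported F1-score
```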
CONCLUSIONS: ML models, especially Random Forest and XGBoost, accurately predicted patient involvement in SDM. By providing proactive tools to assist oncologists in customizing communication strategies, these models support patient-centered and value-based care. Further validation across multiple centers and incorporation of psychosocial factors are recommended to improve clinical relevance.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
PCR17
Topic
Health Service Delivery & Process of Care, Methodological & Statistical Research, Patient-Centered Research
Topic Subcategory
Patient Engagement
Disease
Oncology