Machine Learning Models in Prediction of Head and Neck Squamous Cell Carcinoma Survivability

Author(s)

Rohatgi O¹, Agrawal N², Goswami S³, Vivek V², Chaudhuri M², Aparasu RR⁴
¹Complete HEOR Solutions (CHEORS), Jaipur, RJ, India, ²Complete HEOR Solutions (CHEORS), Chalfont, PA, USA, ³Complete HEOR Solutions (CHEORS), Irvine, CA, USA, ⁴University of Houston, College of Pharmacy, Houston, TX, USA

Presentation Documents

ISPOR2023_Rohatgi_Poster[65202]125931.pdf

OBJECTIVES: Predicting cancer survival is important for patient care management. The objective of this study was to evaluate the performance of common machine learning (ML) models in predicting 5-year overall survival in patients with head and neck squamous cell carcinoma (HNSCC) based on clinical and demographic prognostic factors.

METHODS: Patients aged ≥20 years who were diagnosed with malignant HNSCC between 2008 and 2019 were identified from the Surveillance, Epidemiology, and End Results (SEER) 17 registries database. Patients with unknown or missing values, or with multiple primary sites, were excluded. The study evaluated the performance of three ML models, decision trees (DT), random forest (RF), and support vector machine (SVM), along with logistic regression (LR). The predictors of 5-year survival included patient demographic and clinical characteristics. The classification performance of the models was evaluated based on accuracy, the area under the receiver operating characteristic curve (AUROC), and the F1 score.
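For readers less familiar with the three evaluation metrics named above, the following is a minimal sketch of how accuracy, F1 score, and AUROC can be computed for a binary classifier. It is not the authors' code. The function names, the toy labels and scores, the 0.5 decision threshold, and the coding of the positive class as 1 are all assumptions made for illustration only.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores higher than a random
    negative case (ties count as one half). Equivalent to the area under
    the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6, 0.7, 0.2]  # model-predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"F1       = {f1_score(y_true, y_pred):.3f}")
print(f"AUROC    = {auroc(y_true, scores):.3f}")
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC is computed from the ranking of predicted scores alone. This is one reason a model can rank first on AUROC without also having the highest accuracy.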

RESULTS: The study cohort included 54,263 patients with HNSCC. Most of the selected patients were White (84.3%), male (74.0%), aged ≥65 years (44.9%), and had oropharyngeal cancer (9.4%). The ML algorithms under study were able to predict the overall survival of HNSCC patients. DT outperformed RF, SVM, and LR in terms of AUROC (DT: 72%, RF: 70%, SVM: 66%, LR: 67%). SVM and LR had the highest accuracy, followed by DT and RF (SVM: 70%, LR: 70%, DT: 69%, RF: 67%), while F1 scores were comparable across models (SVM: 56%, LR: 57%, DT: 55%, RF: 56%).

CONCLUSIONS: Overall, performance varied across the ML models, with DT performing best based on AUROC. However, the F1 scores were similar for the three ML models. Further work, including external validation, is needed to evaluate these ML models before they are considered for use in healthcare.

Conference/Value in Health Info

2023-05, ISPOR 2023, Boston, MA, USA

Value in Health, Volume 26, Issue 6, S2 (June 2023)

Code

MSR75

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Missing Data, PRO & Related Methods

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
