Leveraging Machine Learning to Assess the Association of Rash and Survival in Patients With Advanced NSCLC
Moderator
Qianyu Yuan, Flatiron Health, Secaucus, NJ, United States
Speakers
Aaron Dolor; Yunzhi Qian; Doug Donnelly; Melissa Estevez; Yulia Kuznetsova; Nisha Singh, MS; Prakirthi Yerram
OBJECTIVES: The association between rash and survival is well documented for first- and second-generation epidermal growth factor receptor tyrosine kinase inhibitors (EGFR TKIs), but less so for third-generation TKIs. This study leveraged machine learning (ML)-extracted real-world adverse events (rwAEs) to evaluate rwAE incidence and the association between rash and survival outcomes in patients with non-small cell lung cancer (NSCLC) treated with EGFR TKIs.
METHODS: This study used the nationwide Flatiron Health electronic health record-derived, deidentified database. The study included adults aged ≥18 years with advanced EGFR-mutated NSCLC who were treated with first-line (1L) EGFR TKI monotherapy between January 2011 and June 2024. A natural language processing model was used to extract rwAEs. Descriptive statistics were used to compare the incidence of 37 rwAEs overall and by TKI generation. Kaplan-Meier and Cox models were used to evaluate the association between rash incidence and real-world overall survival (rwOS) and real-world progression-free survival (rwPFS). This study also evaluated ICD codes and ML extraction, alone and combined, for identifying rash and its relationship with survival outcomes.
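As an illustration of the Kaplan-Meier step named above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the study's code; the function name `kaplan_meier` and the toy durations are hypothetical, and a real analysis would typically rely on an established survival analysis library.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit survival estimate at each distinct event time.

    durations: follow-up time per patient (hypothetical units, e.g. months)
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # events plus censorings at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exit_counts[t]
    return curve


# Toy data (made up for illustration, not from the study).
toy_durations = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_durations, toy_events)
```

Running the estimator separately for patients with and without rash would give the two curves that a Kaplan-Meier comparison of this kind plots.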
RESULTS: A total of 5606 patients were included in the analysis. Compared with first- and second-generation TKIs, third-generation TKIs showed higher incidences of anemia and QT prolongation and a lower incidence of rash, consistent with clinical trials. Overall, rash incidence was 51%. Rash was associated with improved rwOS (hazard ratio [HR], 0.75; confidence interval [CI], 0.70-0.81) and rwPFS (HR, 0.85; CI, 0.80-0.90) across all TKI generations, notably with third-generation TKIs (rwOS: HR, 0.64; CI, 0.57-0.72; rwPFS: HR, 0.73; CI, 0.67-0.81). ICD codes alone identified a lower rash incidence (11%) than ML extraction combined with ICD codes (52%), but survival benefits were consistent across methods.
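The comparison of identification methods (ICD codes alone, ML extraction alone, and the two combined) can be sketched as below. This is a hedged illustration only: the field names `icd_rash` and `ml_rash`, the function `rash_incidence`, and the toy records are assumptions, and "combined" is taken here to mean a patient is flagged by either source.

```python
def rash_incidence(patients):
    """Share of patients flagged with rash under each identification method.

    Each patient is a dict with two hypothetical boolean fields:
      icd_rash: a rash ICD code is present in structured data
      ml_rash:  the ML model extracted rash from clinical notes
    """
    n = len(patients)
    if n == 0:
        raise ValueError("no patients supplied")
    icd = sum(1 for p in patients if p["icd_rash"])
    ml = sum(1 for p in patients if p["ml_rash"])
    # Assumption: combined means flagged by either source (logical OR).
    combined = sum(1 for p in patients if p["icd_rash"] or p["ml_rash"])
    return {"icd": icd / n, "ml": ml / n, "combined": combined / n}


# Toy records (made up for illustration, not from the study).
toy_patients = [
    {"icd_rash": True, "ml_rash": True},
    {"icd_rash": False, "ml_rash": True},
    {"icd_rash": False, "ml_rash": True},
    {"icd_rash": False, "ml_rash": False},
]
toy_incidence = rash_incidence(toy_patients)
```

Under the either-source assumption, the combined incidence can never be lower than that of either method alone, which matches the direction of the reported figures.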
CONCLUSIONS: The study supports the use of ML to scalably extract rwAEs. With earlier-generation EGFR TKIs, rwAE incidence and the rash-related survival benefit aligned with clinical expectations. Third-generation TKIs showed a similar rash-related survival benefit.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
MSR24
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
SDC: Oncology, STA: Personalized & Precision Medicine