IDENTIFYING ONCOLOGY PATIENT SUBTYPES USING ELECTRONIC PATIENT-REPORTED OUTCOMES (EPROS) DEPLOYED IN CLINICAL PRACTICE: A PATIENT-CENTRIC APPROACH TO MACHINE LEARNING
Author(s)
Nadia Still, DNP; Keri Collette; Mordecai Kramer, MBA; Debra Wujcik, PhD, RN, FAAN
Health Catalyst, Carevive, South Jordan, UT, USA
OBJECTIVES: Digital platforms allow clinicians to use ePROs to predict which patients are at risk and to enable patient-centric intervention. This study applied a machine learning method to ePROs to identify patient groups with high treatment burden.
METHODS: A total of 314 eligible patients were undergoing treatment for cancer and enrolled in an ePRO registry from 9/2020 to 4/2025. Social determinants of health (SDoH), weekly treatment bother (TB) by FACT-GP5, frailty, and PRO-CTCAE-derived symptoms were collected. A Random Forest (RF) model was trained with symptom alerts at first therapy cycle (SA) as the target variable. Validity was confirmed through 10-fold cross-validation (CV), with an AUC of 0.78. Recursive Feature Elimination with CV yielded symptom count (Burden) and TB (Bother) as the top features, and spectral clustering identified distinct patient subgroups.
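The modeling steps described in METHODS can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' implementation: the column names, the simulated values, the hyperparameters, the choice to cluster on only the two top features, and the use of four clusters (taken from the subtypes reported in RESULTS) are all assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_patients = 314  # cohort size from the abstract; every value below is synthetic

# Hypothetical feature columns standing in for the registry variables.
feature_names = ["symptom_count", "treatment_bother", "frailty",
                 "no_caregiver", "distance_gt_10mi"]
symptom_count = rng.poisson(4, n_patients)
treatment_bother = rng.integers(0, 5, n_patients)  # FACT-GP5 is scored 0 to 4
frailty = rng.normal(0.0, 1.0, n_patients)
no_caregiver = rng.integers(0, 2, n_patients)
distance_gt_10mi = rng.integers(0, 2, n_patients)
X = np.column_stack([symptom_count, treatment_bother, frailty,
                     no_caregiver, distance_gt_10mi]).astype(float)

# Synthetic target: symptom alert at first therapy cycle (SA), simulated so
# that it depends on symptom count and treatment bother.
logit = 0.6 * symptom_count + 0.5 * treatment_bother - 2.5
y = (rng.random(n_patients) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Random Forest evaluated with 10-fold cross-validation, scored by AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0)
auc = cross_val_score(rf, X, y, cv=cv, scoring="roc_auc")
print(f"mean cross-validated AUC: {auc.mean():.2f}")

# Recursive Feature Elimination with CV to rank and select features.
selector = RFECV(rf, step=1, cv=cv, scoring="roc_auc",
                 min_features_to_select=2)
selector.fit(X, y)
selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print("selected features:", selected)

# Spectral clustering on the Burden (symptom count) and Bother (TB) features.
burden_bother = StandardScaler().fit_transform(X[:, :2])
clusterer = SpectralClustering(n_clusters=4, affinity="rbf", random_state=0)
labels = clusterer.fit_predict(burden_bother)
print("cluster sizes:", np.bincount(labels))
```

Because the data are simulated, the printed AUC, selected features, and cluster sizes will not match the values reported in the abstract; the sketch only shows how the three steps fit together.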
RESULTS: Meaningful subtypes along a matrix of Burden and Bother were identified. Subtype one (n=134, 74% SA) showed Low Burden/Low Bother, with median symptoms of 2 and GP5 score 1 at baseline. Subtype two (n=93, 88% SA) was High Burden/Low Bother, with high median symptoms at baseline (4) yet GP5 score 2. In this subtype, 85% of patients reported caregiver absence, suggesting a need for social work referral. Subtype three (n=45, 98% SA), or High Burden/High Bother A, had high median symptoms at baseline (5) and GP5 score 3. About 80% had no caregiver and 51% lived >10 miles from the cancer center, indicating a role for support to complete treatment and manage symptoms. Subtype four (n=42, 100% SA), or High Burden/High Bother B, had high median symptoms at baseline (8) and GP5 score 3. In this subtype, 74% of patients had no caregiver and 55% lived >10 miles from the cancer center, suggesting a role for intensive monitoring and care escalation.
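As a quick consistency check on the reported figures, the short sketch below sums the four subtype sizes and derives an approximate cohort-wide symptom alert rate. The overall rate is not reported in the abstract; it is an estimate computed here from the rounded subtype percentages, so it should be read as approximate.

```python
# Subtype sizes and symptom alert (SA) rates as reported in RESULTS.
subtypes = {
    "Low Burden/Low Bother": (134, 0.74),
    "High Burden/Low Bother": (93, 0.88),
    "High Burden/High Bother A": (45, 0.98),
    "High Burden/High Bother B": (42, 1.00),
}

# The subtype sizes should add up to the cohort size given in METHODS.
total = sum(n for n, _ in subtypes.values())

# Approximate count of patients with an alert, from the rounded percentages.
expected_alerts = sum(n * rate for n, rate in subtypes.values())
overall_rate = expected_alerts / total

for name, (n, rate) in subtypes.items():
    print(f"{name}: n={n} ({n / total:.0%} of cohort), SA rate {rate:.0%}")
print(f"total patients: {total}")
print(f"approximate overall SA rate: {overall_rate:.0%}")
```

The subtype sizes sum to the 314 patients stated in METHODS, and the size-weighted alert rate comes out at roughly 85%.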
CONCLUSIONS: Clustering yielded a straightforward, intuitive patient typing approach using a Burden/Bother attribute matrix, with immediate opportunities for tailored, timely outreach to reduce patient burden.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR17
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Oncology