Identifying Risk Profiles for Early Treatment Discontinuation in Geographic Atrophy Using Machine Learning and SHAP Clustering

Author(s)

Ashwin Kumar Rai, MS1, Nick Boucher, BS2, Ariel Berger, MPH3, Fabio Ishii, BS4, Victoria Ikoro, PhD5, Devika Bhandary, MSc5, Andre Ng, MSc5.
1Director of Data Science & Advanced Analytics, Thermo Fisher Scientific, Overland Park, KS, USA, 2Thermo Fisher Scientific, Nepean, ON, Canada, 3Thermo Fisher Scientific, Wilmington, NC, USA, 4Thermo Fisher Scientific, Curitiba, Brazil, 5Thermo Fisher Scientific, London, United Kingdom.
OBJECTIVES: To identify and characterize subgroups of US patients with geographic atrophy (GA) in clinical practice who are at increased risk of early discontinuation of therapy (≤120 days from initiation), using a machine learning (ML) model combined with SHapley Additive exPlanations (SHAP)-based clustering.
METHODS: A supervised ML model was developed using real-world data from Vestrum Health, a large retina-focused electronic medical records database (2.5 million patients across >70 US sites). Data from 8,134 patients were used to train the model, which was validated on an independent test set of 2,034 patients. Model performance was evaluated using the area under the receiver operating characteristic curve (ROC AUC), precision-recall AUC, precision, recall, F1-score, and the confusion matrix. SHAP values were then clustered to derive subgroups (“profiles”) with shared risk factors.
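The abstract does not name the model class or the clustering algorithm, so the following is only an illustrative sketch of the general idea of clustering SHAP values. It assumes a linear stand-in model, for which SHAP values have the closed form w_j * (x_j - mean_j) when features are treated as independent, and a plain k-means with five clusters to mirror the five profiles reported. The feature names, weights, and patient data are hypothetical and synthetic; none of them come from the study.

```python
import random

# Hypothetical feature names and weights (assumptions, not the study's model).
FEATURES = ["poor_baseline_vision", "bilateral_ga", "prior_anti_vegf"]
WEIGHTS = [1.2, 0.8, 0.5]


def linear_shap(weights, X):
    """SHAP values for a linear model f(x) = sum(w_j * x_j), assuming
    independent features: phi_ij = w_j * (x_ij - mean_j)."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(weights))]
    base = sum(w * m for w, m in zip(weights, means))  # expected model output
    phi = [[w * (x - m) for w, x, m in zip(weights, row, means)] for row in X]
    return phi, base


def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means (Lloyd's algorithm) on lists of floats."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Synthetic patients: 200 rows of random feature values in [0, 1).
data_rng = random.Random(42)
X = [[data_rng.random() for _ in FEATURES] for _ in range(200)]

shap_values, base_value = linear_shap(WEIGHTS, X)
labels, centers = kmeans(shap_values, k=5)

# Describe each profile by the feature with the largest mean |SHAP| value.
for c, center in enumerate(centers):
    top = max(range(len(FEATURES)), key=lambda j: abs(center[j]))
    print(f"profile {c}: n={labels.count(c)}, "
          f"dominant driver={FEATURES[top]}, mean SHAP={center[top]:+.3f}")
```

Clustering in SHAP space rather than raw feature space groups patients by how each feature moves their predicted risk, which is what allows a profile to be described by its drivers. With a non-linear model, the closed form above would no longer apply and a dedicated explainer would be needed.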
RESULTS: Sixty-four percent of the test cohort discontinued GA therapy within 120 days of initiation. The model achieved a ROC AUC of 0.62, a precision-recall AUC of 0.74, precision of 0.64, recall of 0.98, and overall accuracy of 64%. Key factors associated with early discontinuation included poor baseline vision, bilateral GA, prior anti-vascular endothelial growth factor (VEGF) treatment, and insurance type. Clustering identified five patient profiles, ranging from the highest risk of early discontinuation (Profile 4, 3% of the cohort, characterized by foveal involvement) to the lowest risk (Profile 3, 26% of the cohort, characterized by better baseline vision). An intermediate-risk group (Profile 2, 31% of the cohort) was primarily defined by unilateral disease.
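The abstract reports summary metrics but not the confusion matrix itself. As a rough arithmetic check, the sketch below back-calculates approximate confusion-matrix counts from the reported test-set size, discontinuation rate, precision, and recall. Because the published figures are rounded to two decimals, the resulting counts are approximations under that assumption, not the study's actual counts.

```python
# Figures reported in the abstract (all rounded in the source).
n_test = 2034       # test-set size
prevalence = 0.64   # share discontinuing within 120 days
precision = 0.64
recall = 0.98

# Approximate counts implied by the rounded figures (assumption: rounding
# error in the inputs is ignored).
positives = round(n_test * prevalence)     # patients who discontinued early
negatives = n_test - positives             # patients who persisted
tp = round(recall * positives)             # correctly flagged discontinuers
fn = positives - tp                        # missed discontinuers
predicted_positive = round(tp / precision) # everyone the model flagged
fp = predicted_positive - tp               # persisters flagged as at risk
tn = negatives - fp                        # persisters correctly not flagged

accuracy = (tp + tn) / n_test
f1 = 2 * precision * recall / (precision + recall)
specificity = tn / negatives

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"accuracy={accuracy:.3f} F1={f1:.3f} specificity={specificity:.3f}")
```

The implied accuracy comes out close to the reported 64%, and the implied F1-score is about 0.77. The small implied true-negative count suggests that, at the operating threshold used, recall was prioritized over specificity. This reading depends entirely on the rounded inputs.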
CONCLUSIONS: Combining ML with SHAP clustering enables identification of patient profiles that reveal not only who is at risk but also why, highlighting how drivers of discontinuation may differ by risk-based subgroup. Using real-world data (RWD) to inform these efforts can enable patient-centered strategies that address specific barriers, support shared decision-making, and ultimately improve treatment persistence, and overall patient health, in clinical practice.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

MSR126

Topic

Methodological & Statistical Research, Patient-Centered Research, Real World Data & Information Systems

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Personalized & Precision Medicine, Sensory System Disorders (Ear, Eye, Dental, Skin)
