A Machine Learning Model to Facilitate Patient-Level Risk Screening in Myelodysplastic Syndromes (MDS) Patients in Routine Clinical Practice


Vaidya V1, Priya V1, Parmar D2, Yan R3, Das R3, Haririfar M3, McMahon P4, Williamson M4, Hogea C4
1ConcertAI, Bengaluru, KA, India, 2ConcertAI, Bengaluru, India, 3ConcertAI, Cambridge, MA, USA, 4Gilead Sciences, Foster City, CA, USA

OBJECTIVES: The objective of this study was to develop an automated, non-invasive machine learning (ML)–based, risk-screening model for patients with MDS utilizing information commonly collected in routine practice. Such tools could help prompt timely use of existing clinical scoring systems such as the Revised International Prognostic Scoring System (IPSS-R), which involve more complex testing.

METHODS: Patients diagnosed with MDS between January 2015 and February 2022 were identified in ConcertAI’s RWD360 database, which consists of structured US-based oncology electronic health records (EHRs) and linked open claims. Death dates were determined from a structured EHR field linked with death registries. Data from up to 1 year before the date of MDS diagnosis were used to develop an XGBoost model. Key features included labs, Charlson Comorbidity Index (CCI), and medications. Five-fold cross-validation and Harrell’s Concordance Index (C-Index) were used to evaluate performance.
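For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Harrell's C-Index can be computed from survival times, event indicators, and model risk scores. It is an illustration only, not the study's implementation: the function name `harrell_c_index`, the toy data, and the choice to skip pairs with tied survival times are all assumptions made here.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model orders correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. In a five-fold cross-validation setting, a metric like this would typically be computed on each held-out fold.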

RESULTS: The model, based on 4309 patients meeting inclusion criteria, achieved a validation-set C-Index of 0.699 for overall survival (OS), in line with the C-Index range reported for IPSS-R in the literature (0.57–0.7). Top predictive features besides age and sex included temporal variation (eg, range and change) in platelet count and oxygen saturation. A reduced-feature version of the model, streamlined for clinical utility and using eight core features (age, sex, CCI, and number of comorbidities, alongside white blood cell count, platelet count, heart rate, and oxygen saturation level values over time), yielded a comparable validation C-Index of 0.675.
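As an illustration of what temporal variation features such as range and change might look like, the sketch below derives both from a time-ordered series of lab values. The abstract does not give exact definitions, so the definitions used here (range as maximum minus minimum, change as last value minus first value), the function name `temporal_features`, and the example values are assumptions.

```python
def temporal_features(values):
    """Summarize a time-ordered series of measurements for one patient.

    Assumed definitions (not confirmed by the abstract):
      range  -- maximum minus minimum over the look-back window
      change -- most recent value minus earliest value
    """
    if not values:
        raise ValueError("at least one measurement is required")
    return {
        "range": max(values) - min(values),
        "change": values[-1] - values[0],
    }


# Hypothetical platelet counts, oldest first, from the year before diagnosis.
platelet_counts = [150, 120, 135, 90]
platelet_features = temporal_features(platelet_counts)
```

Features of this kind let a tree-based model use a patient's trajectory rather than a single snapshot value.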

CONCLUSIONS: A data-driven ML model was built that can reasonably predict OS for patients with MDS based on individual characteristics. This was achieved by integrating comprehensive patient-level data with temporal feature engineering to capture the patient's clinical trajectory. Such models have the potential to flag risk and cue confirmatory testing for timely identification of patients eligible for treatment.

Conference/Value in Health Info

2023-11, ISPOR Europe 2023, Copenhagen, Denmark

Value in Health, Volume 26, Issue 11, S2 (December 2023)




Clinical Outcomes, Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Clinical Outcomes Assessment, Electronic Medical & Health Records


Oncology, Personalized & Precision Medicine
