REAL-WORLD ENDPOINT MAPPING AND IDENTIFICATION OF EVOLVING PHENOTYPES OF POMPE DISEASE USING MACHINE LEARNING
Author(s)
Edin Guso, MSc1, Atef Zaher, MD2, Kenneth I. Berger, MD3, Vassia Liakoni, PhD1, Martin Beaussart, MSc1, Rory Burke, MSc1, David Castellano Falcon, MSc1, Vahid Esmaeili, PhD1, Leon van Wouwe, MSc1, Christopher M. Rudolf1, Andres Rondon4, Kristina An Haack, MD4
1Volv Global, Épalinges, Switzerland, 2Sanofi, Toronto, ON, Canada, 3Sanofi, Morristown, NJ, USA, 4Sanofi, Gentilly, France
OBJECTIVES: Pompe disease (PD) is a rare, chronically debilitating metabolic disorder. As enzyme replacement therapy (ERT) extends survival, patients may experience new long-term manifestations that are not fully captured by traditional clinical trial endpoints. In addition, many clinical endpoints are not directly observable in real-world claims data. This study assessed whether key PD-related clinical endpoints and emerging disease manifestations can be reliably captured in real-world data (RWD), bridging clinical concepts with readily available claims data.
METHODS: We identified patients with PD (infantile-onset and late-onset) in a large U.S. administrative claims database (Komodo) using confirmed diagnosis and/or treatment records. Machine learning models were developed to map 67 pre-defined clinical endpoints to diagnosis, procedure, and treatment codes available in claims. In addition, we used a data-driven approach to identify features that distinguished the PD cohort beyond the pre-specified endpoints. The prevalence of mapped and newly discovered features was evaluated in the PD cohort and compared with that in a control population.
RESULTS: A total of 3,038 patients with PD were identified. Machine learning successfully mapped 46 of the 67 pre-defined clinical endpoints to codes present in the claims-based RWD. Additionally, data-driven discovery identified novel distinctive features in the PD cohort including cardiovascular, respiratory and systemic manifestations, as well as patterns of healthcare utilization. All mapped endpoints were significantly more prevalent among patients with PD than in controls. Overall, treated patients had a higher prevalence of both mapped and newly discovered endpoints.
CONCLUSIONS: Our machine learning approach mapped key pre-specified PD clinical endpoints to codes available in claims-based RWD and identified additional disease manifestations that may reflect the evolving phenotypes of PD in the era of disease-modifying therapy. This approach provides an opportunity to identify clinically meaningful endpoints for future natural history studies and clinical trials.
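ILLUSTRATIVE SKETCH: The abstract does not state which statistical test was used to compare endpoint prevalence between the PD cohort and controls. As one possible reading of that comparison, the Python sketch below computes the prevalence of a single mapped endpoint in each group, the prevalence ratio, and a two-sided two-proportion z-test. The function name, its parameters, and the counts in the usage line are hypothetical and are not study data.

```python
import math


def compare_prevalence(pd_with, pd_total, ctrl_with, ctrl_total):
    """Compare prevalence of one endpoint between a PD cohort and controls.

    Assumption: a pooled two-proportion z-test is used here for illustration;
    the abstract does not specify the authors' actual test.
    """
    p_pd = pd_with / pd_total          # prevalence in the PD cohort
    p_ctrl = ctrl_with / ctrl_total    # prevalence in the control population

    # Pooled proportion under the null hypothesis of equal prevalence.
    pooled = (pd_with + ctrl_with) / (pd_total + ctrl_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / pd_total + 1 / ctrl_total))
    if se == 0:
        raise ValueError("endpoint is present in all or none of the patients")

    z = (p_pd - p_ctrl) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    ratio = p_pd / p_ctrl if p_ctrl > 0 else float("inf")
    return {
        "prevalence_pd": p_pd,
        "prevalence_ctrl": p_ctrl,
        "prevalence_ratio": ratio,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical counts, for illustration only (not taken from the study).
result = compare_prevalence(pd_with=30, pd_total=100, ctrl_with=10, ctrl_total=100)
print(result)
```

With 46 mapped endpoints tested in parallel, an analysis of this kind would normally also apply a multiple-comparison correction; the sketch omits it for brevity.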
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD22
Topic
Real World Data & Information Systems
Disease
SDC: Diabetes/Endocrine/Metabolic Disorders (including obesity), SDC: Musculoskeletal Disorders (Arthritis, Bone Disorders, Osteoporosis, Other Musculoskeletal), SDC: Rare & Orphan Diseases