GENERAL-PURPOSE VS HEOR-SPECIFIC GENERATIVE AI IN RARE DISEASE MODELING: A DUCHENNE MUSCULAR DYSTROPHY CASE STUDY
Author(s)
Sumeyye Samur, PhD1, Turgay Ayer, PhD1, Ismail F. Yildirim, MSc1, Mine Tekman, PhD1, Jag Chhatwal, PhD2
1Value Analytics Labs, Boston, MA, USA, 2Massachusetts General Hospital/ Harvard Medical School, Boston, MA, USA
OBJECTIVES: Health economic modeling for rare diseases is constrained by limited clinical and economic evidence and by heterogeneous disease progression. This study evaluated the feasibility of using generative AI (GenAI) to support rare disease model conceptualization by comparing Duchenne muscular dystrophy (DMD) models generated by three AI platforms.
METHODS: Two general-purpose GenAI platforms (ChatGPT 5.2 and Gemini 3 Flash) and one HEOR-specific platform (ValueGen.AI) were assessed. ValueGen.AI is a multi-agent, deep-research system implemented in Python using LangGraph. All platforms received an identical prompt asking them to propose a DMD health economic model conceptualization. Outputs were compared across ISPOR/HTA-relevant domains: model structure, DMD milestones and endpoints, treatment effect conceptualization, evidence traceability, validation, and characterization of parameter and structural uncertainty. Alignment with established DMD frameworks (D-RSC/Project HERCULES) was assessed qualitatively.
RESULTS: All three platforms recommended a state-transition approach. ChatGPT and ValueGen.AI converged on a milestone-driven progression framework anchored to key DMD transitions (e.g., loss of ambulation, ventilation dependence), and both included detailed cost and utility components (including caregiver burden). Only ValueGen.AI provided an HTA-aligned framework with 20+ DMD-specific citations, explicit alignment with D-RSC/Project HERCULES, and a CHEERS-consistent validation plan; it also specified PSA distributions and gene-therapy-relevant treatment effects (curve-shift/durability scenarios). ChatGPT, while similarly implementable, provided no citations, validation targets, or HTA precedent, limiting transparency and reproducibility. Gemini proposed a high-level Markov ladder with broad cost/utility categories but had limited endpoint specificity (minimal pulmonary/cardiac integration), no structured adverse-event module, and no explicit validation or uncertainty framework.
CONCLUSIONS: GenAI can support early-stage rare disease model conceptualization, but HEOR-specialized platforms provide substantially greater transparency, alignment with established frameworks, and HTA relevance than general-purpose models.
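ILLUSTRATIVE SKETCH: The milestone-driven state-transition structure described in the results can be sketched as a small Markov cohort model. This is a minimal illustration only, not the output of any platform assessed here: the state names, the annual transition probabilities, the `apply_treatment` helper, and the relative-risk value are all hypothetical placeholders chosen for readability, and the treatment effect is modeled as a simple slowing of progression rather than a calibrated curve-shift.

```python
# Hypothetical milestone states; NOT taken from the study or from D-RSC/Project HERCULES.
STATES = ["ambulatory", "non_ambulatory", "ventilation_dependent", "dead"]

# Placeholder annual transition probabilities (each row sums to 1).
# Progression is one-directional: no state can move back to an earlier milestone.
TRANSITIONS = {
    "ambulatory": {"ambulatory": 0.85, "non_ambulatory": 0.14,
                   "ventilation_dependent": 0.00, "dead": 0.01},
    "non_ambulatory": {"ambulatory": 0.00, "non_ambulatory": 0.87,
                       "ventilation_dependent": 0.10, "dead": 0.03},
    "ventilation_dependent": {"ambulatory": 0.00, "non_ambulatory": 0.00,
                              "ventilation_dependent": 0.90, "dead": 0.10},
    "dead": {"ambulatory": 0.00, "non_ambulatory": 0.00,
             "ventilation_dependent": 0.00, "dead": 1.00},
}


def run_cohort(transitions, start, cycles):
    """Return the cohort trace: a list of state-occupancy dicts, one per cycle."""
    trace = [dict(start)]
    for _ in range(cycles):
        current = trace[-1]
        nxt = {state: 0.0 for state in STATES}
        for src, share in current.items():
            for dst, prob in transitions[src].items():
                nxt[dst] += share * prob
        trace.append(nxt)
    return trace


def apply_treatment(transitions, relative_risk):
    """Scale progression to later living states by relative_risk (assumed effect).

    Mortality is left unchanged; the probability of staying in the current
    state absorbs the remainder so that every row still sums to 1.
    """
    treated = {}
    for src, row in transitions.items():
        new_row = dict(row)
        for dst in row:
            if dst not in (src, "dead"):
                new_row[dst] = row[dst] * relative_risk
        new_row[src] = 1.0 - sum(p for d, p in new_row.items() if d != src)
        treated[src] = new_row
    return treated


start = {"ambulatory": 1.0, "non_ambulatory": 0.0,
         "ventilation_dependent": 0.0, "dead": 0.0}
base_trace = run_cohort(TRANSITIONS, start, cycles=10)
treated_trace = run_cohort(apply_treatment(TRANSITIONS, 0.5), start, cycles=10)
```

Comparing `base_trace[-1]` with `treated_trace[-1]` shows how a slower progression rate shifts cohort occupancy toward earlier milestones. A full model would add costs and utilities per state, discounting, and probabilistic sampling of the inputs.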
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR225
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
SDC: Musculoskeletal Disorders (Arthritis, Bone Disorders, Osteoporosis, Other Musculoskeletal), SDC: Rare & Orphan Diseases