Evaluating GenAI for HTA Analogue Analysis: Identifying Rare Disease Approvals With Surrogate Endpoints and Single-Arm Trials

Author(s)

Rachel Beckerman, PhD¹, Charles Davis, BA¹, Sylwia Lach, MPH², Chloe Shepphard, MPH³.
¹Maple Health Group LLC, New York, NY, USA; ²Maple Health Group, LLC, Cracow, Poland; ³Maple Health Group LLC, London, United Kingdom.
OBJECTIVES: Analogue analyses are crucial for understanding drivers of and barriers to positive health technology assessment (HTA) outcomes, and are particularly important in rare disease, where specific constraints often necessitate reliance on surrogate endpoints and single-arm studies. We evaluated the extent to which different generative AI (genAI) models can replicate human-led analogue analyses by identifying therapies that met the following criteria: (1) rare disease indication, (2) surrogate primary endpoint, (3) single-arm pivotal trial, (4) available HTA decision in at least one major jurisdiction (e.g., NICE, G-BA, HAS), and (5) approval by the EMA after January 1, 2020.
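As an illustration only, the five inclusion criteria above can be expressed as a simple screening rule. The sketch below is not the authors' prompt or workflow; the `Candidate` record, its field names, and the placeholder therapies are hypothetical, and it assumes that "after January 1, 2020" excludes the cutoff date itself.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: approvals dated exactly on the cutoff do not qualify.
EMA_CUTOFF = date(2020, 1, 1)


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one therapy; field names are illustrative."""
    name: str
    rare_disease: bool
    surrogate_primary_endpoint: bool
    single_arm_pivotal_trial: bool
    hta_decisions: tuple  # agencies with a published decision, e.g. ("NICE",)
    ema_approval_date: date


def failed_criteria(c: Candidate) -> list:
    """Return the names of the criteria that the candidate does not meet."""
    checks = {
        "rare_disease": c.rare_disease,
        "surrogate_primary_endpoint": c.surrogate_primary_endpoint,
        "single_arm_pivotal_trial": c.single_arm_pivotal_trial,
        "hta_decision_available": len(c.hta_decisions) > 0,
        "ema_approval_after_cutoff": c.ema_approval_date > EMA_CUTOFF,
    }
    return [label for label, passed in checks.items() if not passed]


def meets_all_criteria(c: Candidate) -> bool:
    """A candidate qualifies only if no criterion fails."""
    return not failed_criteria(c)


# Placeholder data, not real therapies or real approval dates.
candidates = [
    Candidate("Therapy A", True, True, True, ("NICE",), date(2021, 6, 1)),
    Candidate("Therapy B", True, True, True, ("HAS",), date(2019, 11, 15)),
    Candidate("Therapy C", True, False, True, ("G-BA",), date(2022, 3, 10)),
]
qualifying = [c.name for c in candidates if meets_all_criteria(c)]
print(qualifying)  # ['Therapy A']
```

Recording which criterion failed, rather than a bare yes/no, makes it easier to trace why a candidate was excluded.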
METHODS: A detailed prompt was engineered to address the research question. The performance of three leading genAI models (GPT-4, Claude 3, Gemini 1.5) was benchmarked against a reference analogue set meeting the five criteria, previously curated by experienced consultants. Each model was prompted to identify all therapies meeting the five criteria. Outputs were assessed for accuracy, interpretability, and completeness. Task duration and estimated cost were compared between the genAI and human workflows.
RESULTS: All genAI models successfully identified a subset of the analogues in the reference analogue set, but differed substantially in the number of analogues correctly identified as fulfilling all criteria. No genAI analogue set was as accurate as the human workflow set: genAI models frequently included analogues that did not satisfy all criteria, with false inclusions often linked to misclassification of endpoints or of the regulatory approval date. GenAI models also frequently excluded analogues that were included by the human workflow. However, the genAI-assisted workflow reduced both research time and cost.
CONCLUSIONS: GenAI has strong potential to augment consultants’ HTA analogue analyses, though final curation by human experts remains essential to ensure accuracy. Use of genAI models can improve the efficiency of HTA analogue analyses and streamline the identification of relevant analogues to inform insights on HTA drivers and barriers.
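
The benchmarking described in the abstract, comparing each model's analogue set against the consultant-curated reference set, can be sketched as a set comparison. This is a minimal sketch under stated assumptions: the abstract does not say how accuracy and completeness were scored, so precision and recall are used here as stand-ins, and the function name `benchmark` and the therapy labels are hypothetical placeholders rather than study data.

```python
def benchmark(model_set, reference_set):
    """Compare one model's analogue set against the reference set.

    Precision stands in for accuracy (share of model picks that are in the
    reference set); recall stands in for completeness (share of the
    reference set that the model found). Both mappings are assumptions.
    """
    model, reference = set(model_set), set(reference_set)
    correct = model & reference          # analogues both agree on
    false_inclusions = model - reference  # model picks not in the reference
    missed = reference - model            # reference analogues the model omitted
    return {
        "precision": len(correct) / len(model) if model else 0.0,
        "recall": len(correct) / len(reference) if reference else 0.0,
        "false_inclusions": sorted(false_inclusions),
        "missed": sorted(missed),
    }


# Placeholder labels only; these are not the therapies from the study.
reference = {"Therapy A", "Therapy B", "Therapy C", "Therapy D"}
model_output = {"Therapy A", "Therapy B", "Therapy E"}

result = benchmark(model_output, reference)
print(result["false_inclusions"])  # ['Therapy E']
print(result["missed"])            # ['Therapy C', 'Therapy D']
```

Listing the false inclusions and missed analogues by name, alongside the summary ratios, gives a human reviewer a concrete list to check during final curation.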

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

HTA131

Topic

Health Technology Assessment

Topic Subcategory

Decision & Deliberative Processes

Disease

Rare & Orphan Diseases
