ACCELERATING DYNAMIC HTA LANDSCAPING IN ONCOLOGY THROUGH AUTONOMOUS GENERATIVE AI-DRIVEN MULTILINGUAL DATA EXTRACTION

Author(s)

Manuel Cossio, MPhil, MS¹, Lilia Leisle, PhD²
¹Cytel, Director, Artificial Intelligence Lead, Dubendorf, Switzerland; ²Cytel, Berlin, Germany
OBJECTIVES: To develop and evaluate autonomous large language model (LLM)-based agents for structured information extraction from multilingual Health Technology Assessment (HTA) reports to support EU Joint Clinical Assessment (JCA) Population-Intervention-Comparator-Outcome (PICO) simulation, including standard PICO elements and context-specific (CS) HTA evidence.
METHODS: Two sequential LLM-based agents were developed to perform information extraction using 21 expert-generated questions. The extraction covered standard PICO components, including the assessed population, accepted comparators, and outcomes, as well as context-specific elements such as methodological requirements, reasons for non-acceptance of outcomes or comparators, and other critique points reported in HTA documents. Agent 1 used a general prompt, while Agent 2 incorporated additional clarification instructions within selected questions to improve contextual understanding. Performance was evaluated using a custom scoring framework assigning 1 point each for accuracy and completeness. Any response containing hallucinated content received a total score of 0 regardless of accuracy or completeness. The agents were evaluated on publicly available osimertinib HTA reports from Spain (4,678 words), the Netherlands (2,512 words), and France (9,876 words).
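The scoring rule described in the methods can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the authors' implementation: the names `Judgement`, `score`, `label`, and `tally` are hypothetical, and the mapping of scores to "fully correct" (2 points) and "partially correct" (1 point) is an assumption, since the abstract does not define those categories explicitly.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    """Expert verdict on one extracted answer (hypothetical structure)."""
    accurate: bool      # the answer matches the HTA report
    complete: bool      # the answer covers everything the question asks for
    hallucinated: bool  # the answer contains content absent from the report


def score(j: Judgement) -> int:
    """1 point for accuracy plus 1 for completeness; 0 if anything is hallucinated."""
    if j.hallucinated:
        return 0
    return int(j.accurate) + int(j.complete)


def label(j: Judgement) -> str:
    """Assumed category names; the abstract does not spell out this mapping."""
    if j.hallucinated:
        return "hallucinated"
    return {2: "fully correct", 1: "partially correct", 0: "incorrect"}[score(j)]


def tally(judgements: list[Judgement]) -> Counter:
    """Count answers per category for one agent on one report."""
    return Counter(label(j) for j in judgements)
```

Under these assumptions, an answer that is accurate and complete but contains any hallucinated content still scores 0, which is the override the methods describe.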
RESULTS: Both agents completed the full extraction set across all documents, with approximately 90% of questions answered without hallucinations. Agent 2 outperformed Agent 1, achieving a higher mean number of fully correct responses (16.6 vs. 13.3). Both agents performed best on the French HTA report. Agent 1 generated more partially correct answers (mean 6.6 vs. 5) and was the only agent to produce hallucinated content, which occurred in the Spanish report.
CONCLUSIONS: Expert-guided prompt refinement substantially improved autonomous extraction of both standard and CS HTA information from multilingual reports. LLM-based agents show promise for scalable HTA data extraction to support EU JCA PICO simulations. However, further methodological refinement and integration of HTA experts as humans-in-the-loop remain essential to reduce hallucinations, verify completeness and accuracy, and ensure reliability in regulatory and HTA applications.

Conference/Value in Health Info

2026-05, ISPOR 2026, Philadelphia, PA, USA

Value in Health, Volume 29, Issue S6

Code

HTA27

Topic

Health Technology Assessment

Topic Subcategory

Decision & Deliberative Processes

Disease

SDC: Oncology
