AI for Rapid Data Extraction: A Case Study on the Economic and Caregiver Burden of Motor Neuron Disease

Author(s)

Kim Wager, DPhil1, Tomas Rees, PhD2, Maxandre Jacqueline, MRes3, Jungwon Byun, BA (Econ)4.
1Oxford PharmaGenesis, Tubney, United Kingdom, 2Oxford PharmaGenesis Ltd, Oxford, United Kingdom, 3Oxford PharmaGenesis, Oxford, United Kingdom, 4Elicit, Oakland, CA, USA.
OBJECTIVES: Evidence synthesis in health economics and outcomes research (HEOR) is bottlenecked by systematic literature reviews (SLRs), which, although considered the gold standard, are time-consuming and laborious. New artificial intelligence (AI) tools claim to improve the efficiency of SLRs while maintaining or improving accuracy, but data supporting these claims are limited. In this study, we assessed the performance of an AI research platform (Elicit) on a review related to motor neuron disease, a disease with a complicated evidence base.
METHODS: Elicit was used to identify and extract data from studies reporting epidemiology, economic burden, and caregiver impact. Following semantic search and AI-assisted screening with PICO criteria, we performed data extraction across 167 publications using 16 predefined variables. AI performance was compared against a human ground-truth dataset (n = 21 publications), using a scoring rubric assessing completeness, accuracy, detail and uncertainty handling (0-4 points each, total 16).
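The rubric arithmetic described above (four domains scored 0-4, summed to a total out of 16, then averaged across publications) can be sketched as follows. This is an illustrative sketch only, not the authors' scoring code: the class and function names are hypothetical, the example scores are invented for demonstration, and rounding to the nearest whole percent is an assumption about how the reported percentages were derived.

```python
from dataclasses import dataclass

# Rubric domains named in the abstract; each is scored 0-4.
DOMAINS = ("completeness", "accuracy", "detail", "uncertainty_handling")
MAX_PER_DOMAIN = 4
MAX_TOTAL = MAX_PER_DOMAIN * len(DOMAINS)  # 16


@dataclass(frozen=True)
class RubricScore:
    """Score for one AI extraction from one publication (hypothetical structure)."""
    completeness: int
    accuracy: int
    detail: int
    uncertainty_handling: int

    def __post_init__(self):
        # Reject any domain score outside the 0-4 range.
        for name in DOMAINS:
            value = getattr(self, name)
            if not 0 <= value <= MAX_PER_DOMAIN:
                raise ValueError(f"{name} must be in 0..{MAX_PER_DOMAIN}, got {value}")

    @property
    def total(self) -> int:
        # Sum of the four domain scores (0-16).
        return sum(getattr(self, name) for name in DOMAINS)


def mean_total(scores):
    """Mean total score across publications."""
    return sum(s.total for s in scores) / len(scores)


def as_percent(mean_score, max_total=MAX_TOTAL):
    """Express a mean total as a whole-number percentage of the maximum."""
    return round(100 * mean_score / max_total)


if __name__ == "__main__":
    # Invented example scores, not data from the study.
    example = [RubricScore(4, 4, 3, 3), RubricScore(4, 4, 4, 4)]
    print(mean_total(example), as_percent(mean_total(example)))
```

Under these assumptions, a mean total of 14.43 corresponds to 90% of 16 and a mean of 13.10 to 82%, which matches the percentages reported in the results.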
RESULTS: For a simple extraction task (population size), AI achieved a mean score of 14.43/16 (90%). In 24% of cases, AI extracted correct values that the human reviewer had extracted incorrectly; in 14%, AI provided additional detail.
For a more complex extraction task (direct costs), AI achieved a mean score of 13.10/16 (82%). AI performed well across all domains but showed variability in completeness (SD 1.32) and uncertainty handling (SD 1.53). In 14% of publications, AI inappropriately inferred results; in 24%, AI reported incorrect/incomplete data.
CONCLUSIONS: AI-assisted evidence extraction shows promise for accelerating HEOR research, achieving a mean score of 90% for simple and 82% for complex extractions. AI successfully handled diverse study designs and sometimes outperformed humans. However, the 24% error rate for complex tasks highlights the need for human oversight and suggests that hybrid AI-human approaches could improve systematic review feasibility and scope while maintaining rigor. Future research should focus on optimizing AI-human collaboration workflows and developing standardized validation frameworks for AI-assisted evidence synthesis.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

SA8

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
