AI-Augmented Data Extraction in Literature Reviews: Toward First-Pass Accuracy Competitive With Human Performance

Author(s)

Ross De Burgh, PhD1, Karl Moritz Herrmann, PhD2, Sam Mardini, BSc3, Christoph R Schlegel, PhD4.
1MedScope Review Solutions, Theoule Sur Mer, France, 2Reliant AI Europe GmbH, Berlin, Germany, 3Reliant AI Inc., Boston, MA, USA, 4Co-founder, Reliant AI Europe GmbH, Berlin, Germany.
OBJECTIVES: AI tools for literature reviews are well known for improving efficiency, but questions around accuracy and repeatability remain key barriers to adoption in regulatory and scientific settings. Human first-pass data extraction is never 100% accurate due to inherent variability (typos, interpretation errors). Demonstrating that AI can outperform or match human accuracy at first pass is critical for driving trust and adoption. Our objective was to evaluate the first-pass accuracy of AI-based structured data extraction (Reliant Tabular) vs first-pass human extraction in systematic literature reviews.
METHODS: A standardized literature review protocol was developed with predefined structured data fields (e.g., sample sizes, outcome measures, study design elements). First-pass extractions were performed using both Reliant Tabular and human reviewers. Each output was compared to a gold-standard dataset (manually validated). Accuracy rates and inter-rater agreement were calculated for both extraction methods.
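As an illustration of the two metrics named in the methods, the following is a minimal sketch, not the authors' implementation. It assumes that accuracy means the share of extracted field values that exactly match the gold standard, and that inter-rater agreement is measured with Cohen's kappa; the abstract does not specify either choice. The function names and any input values are hypothetical.

```python
from collections import Counter


def accuracy(extracted, gold):
    """Share of extracted values that exactly match the gold standard.

    Assumption: exact string match per field; the abstract does not
    state the matching rule that was actually used.
    """
    if len(extracted) != len(gold):
        raise ValueError("extracted and gold must have the same length")
    if not gold:
        raise ValueError("at least one field is required")
    matches = sum(e == g for e, g in zip(extracted, gold))
    return matches / len(gold)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: kappa is used here as one common agreement statistic;
    the abstract does not name the statistic that was calculated.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same items")
    n = len(rater_a)
    if n == 0:
        raise ValueError("at least one item is required")
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

In this sketch, each extraction method (AI or human) would be scored with `accuracy` against the gold-standard dataset, while `cohen_kappa` would compare two sets of labels over the same fields.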
RESULTS: AI-based extraction demonstrated accuracy comparable to that of human reviewers across key structured fields in the literature review. Human first-pass extraction typically achieves an accuracy rate of approximately 80-95%. The AI system approached this range, with most discrepancies arising from entity resolution and ambiguous phrasing. In contrast, human reviewers' errors most frequently stemmed from typographical mistakes, omissions, or inconsistent interpretations of study elements.
CONCLUSIONS: AI-based data extraction not only accelerates literature workflows but can approach, and potentially exceed, human first-pass accuracy, offering a strong foundation for scalable, repeatable, regulator-friendly evidence generation.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

MSR23

Topic

Health Policy & Regulatory, Health Technology Assessment, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
