Exploring Artificial Intelligence's (AI's) Role in Literature Screening: A Comparative Analysis Paving the Way for More Efficient Evidence Synthesis

Author(s)

Gautamjeet Singh Mangat, MSc¹, Astha Jain, MSc¹, Sugandh Sharma, MSc², Sangeeta Budhia, PhD³.
¹Parexel, Mohali, India; ²Parexel, Chandigarh, India; ³Parexel, London, United Kingdom.
OBJECTIVES: To compare the performance of AI-assisted screening with human-only screening, using accuracy, recall, precision, and F1 score as key performance metrics.
METHODS: The human-only approach involved complete title and abstract screening by a senior reviewer, with a 30% quality check. The AI-assisted method used 30% of citations as a human-reviewed training set, with the AI screening the remaining 70%. Performance metrics for the AI-assisted output were calculated against the human-only output across three distinct review types in oncology.
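The 30%/70% division described in METHODS can be sketched as follows. This is a minimal illustration only: the abstract does not state how the training citations were selected, so the random sampling, the fixed seed, and the function name `split_citations` are all assumptions rather than the authors' procedure.

```python
import random


def split_citations(citation_ids, train_fraction=0.30, seed=0):
    """Split citation IDs into a human-reviewed training set and an AI-screened set.

    Assumption: simple random sampling. The abstract does not say how the
    30% training set was chosen.
    """
    ids = list(citation_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Example using the size of the treatment patterns review (n=1,959).
train_ids, ai_ids = split_citations(range(1959))
print(len(train_ids), len(ai_ids))
```

The two returned lists are disjoint and together cover every citation, so each citation is either reviewed by a human for training or screened by the AI, never both.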
RESULTS: The study encompassed three review types: treatment patterns (n=1,959), clinical (n=1,331), and epidemiology (n=4,124). At title and abstract screening, human reviewers included 213, 274, and 558 studies, respectively. Measured against the human-only output, the AI-assisted approach achieved accuracy of 75.65% (treatment patterns), 66.29% (clinical), and 91.03% (epidemiology). Recall (sensitivity) was 74.65%, 68.98%, and 92.11% for the respective review types, meaning the AI identified at least two-thirds of relevant citations in every review. However, precision was low across all reviews, at 27.32% (treatment patterns), 34.17% (clinical), and 61.19% (epidemiology), indicating that the AI included many irrelevant citations. Low F1 scores of 40.00% (treatment patterns) and 45.63% (clinical) further reflected the imbalance between precision and recall; only the epidemiology review (73.56%) showed a relatively balanced performance.
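For readers less familiar with the four metrics, the sketch below shows the standard definitions, treating the human decision as the reference and an "include" as the positive class. The function names and the confusion-matrix counts in the first example are hypothetical; the abstract reports only the resulting percentages, not the underlying counts.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard metrics from confusion-matrix counts.

    tp: included by both AI and human      fp: included by AI only
    fn: included by human only             tn: excluded by both
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = f1_from(precision, recall)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not taken from the study).
example = screening_metrics(tp=8, fp=2, fn=2, tn=8)
print(example)

# Reported treatment patterns values: precision 27.32%, recall 74.65%.
treatment_f1 = f1_from(0.2732, 0.7465)
print(round(100 * treatment_f1, 2))  # 40.0, matching the reported F1
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two inputs, which is why recall of about 75% combined with precision of about 27% yields an F1 of only 40%.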
CONCLUSIONS: The AI-assisted screening approach demonstrates high accuracy and recall, showing promise despite its over-inclusivity. While precision requires improvement, the AI's strength lies in its comprehensive identification of relevant studies. Given the rapid advancements in AI technology, future iterations are likely to perform better. The AI-assisted approach supports the conduct of targeted reviews (epidemiology, treatment patterns) with reduced human involvement. For clinical reviews intended for reimbursement submissions, AI should serve as a supportive tool that assists human reviewers and enhances accuracy.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

MSR102

Topic

Health Technology Assessment, Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

Oncology
