Application of Artificial Intelligence in Literature Reviews


Venkata SK1, Velicheti S1, Jamdade V1, Ranganathan S1, Achra M1, Banerjee KK1, Dutta Gupta C1, Happich M2, Barrett A3
1Eli Lilly Services India Private Limited, Bengaluru, India, 2Eli Lilly and Company Limited, Basingstoke, Hampshire, UK, 3Eli Lilly and Company Limited, Bracknell, BRC, UK

OBJECTIVES: This study evaluated the efficiency of the artificial intelligence (AI) tools of DistillerSR® in conducting title-abstract (Ti-Ab) screening, a critical and resource-intensive step in targeted literature reviews (TLRs) and systematic literature reviews (SLRs).

METHODS: A total of eight TLRs and three SLRs were conducted between February 2021 and June 2023 using ‘DistillerAI’ and ‘Classifiers’, respectively, on the DistillerSR® platform. Efficiency was assessed in terms of ‘screening burden’ and ‘accuracy’ (false negatives [FN] %). In each review, at least 10% of the total citations were screened manually (single review for TLRs; dual review with an independent conflict resolver for SLRs) and used as a ‘training set’ for the AI. In TLRs, DistillerAI uses the responses from the training set to assign each unscreened citation a likelihood-of-relevance score ranging from ‘0’ (potential exclusion) to ‘1’ (potential inclusion). In SLRs, by contrast, the classifier (include/exclude) uses the training set and reviews all unreviewed citations in one of the two reviewer sets. The classifiers are validated using a ‘balance score’ and a ‘recall score’ (proportion of true positives/negatives vs false positives/negatives).
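As a rough illustration of how likelihood scores and FN-based accuracy relate, the following Python sketch compares score-based AI decisions against human decisions on a small validation sample. It is not DistillerSR's implementation: the 0.5 threshold, the sample data, the `evaluate` helper, and the definition of FN % as false negatives divided by all evaluated citations are all assumptions made for this example, since the abstract does not state them.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    tp: int  # AI include, human include
    fp: int  # AI include, human exclude
    tn: int  # AI exclude, human exclude
    fn: int  # AI exclude, human include (a missed relevant citation)

    @property
    def recall(self) -> float:
        # Share of human-included citations that the AI also included.
        return self.tp / (self.tp + self.fn)

    @property
    def fn_percent(self) -> float:
        # Assumed definition: false negatives as a percentage of all
        # evaluated citations.
        total = self.tp + self.fp + self.tn + self.fn
        return 100 * self.fn / total


def evaluate(scores, human_includes, threshold=0.5):
    """Turn likelihood scores (0 to 1) into include/exclude decisions
    using a hypothetical threshold, then count agreement with humans."""
    tp = fp = tn = fn = 0
    for score, human in zip(scores, human_includes):
        ai_include = score >= threshold
        if ai_include and human:
            tp += 1
        elif ai_include and not human:
            fp += 1
        elif not ai_include and human:
            fn += 1
        else:
            tn += 1
    return ScreeningMetrics(tp, fp, tn, fn)


# Made-up validation sample, for illustration only.
scores = [0.92, 0.81, 0.40, 0.10, 0.05, 0.65, 0.30, 0.02]
human = [True, True, True, False, False, False, False, False]

m = evaluate(scores, human)
print(f"recall={m.recall:.2f}, FN%={m.fn_percent:.1f}")
```

In this toy sample the citation scored 0.40 is a false negative: a human included it, but the assumed threshold would have excluded it. Lowering the threshold trades fewer false negatives for more citations needing manual review.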

RESULTS: A total of 37,570 citations were identified from databases across the 11 reviews, of which 12,464 were manually screened and 25,106 were AI screened. The median (range) accuracy score across these reviews was 90% (84%-96%), with a mean FN rate of 1.64%, which was comparable to manual screening. In terms of screening burden, applying AI to Ti-Ab screening reduced human effort by around 67%.
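The reported counts can be cross-checked with a few lines of Python. This sketch assumes that the screening-burden reduction is the share of citations screened by AI rather than by a human; the abstract does not give the formula, so that definition and the variable names are assumptions.

```python
# Counts reported in the RESULTS section.
total_citations = 37_570
manually_screened = 12_464
ai_screened = 25_106

# The two subsets should account for every identified citation.
assert manually_screened + ai_screened == total_citations

# Assumed definition: burden reduction = share of citations screened by AI.
burden_reduction = ai_screened / total_citations
manual_share = manually_screened / total_citations

print(f"AI-screened share:     {burden_reduction:.1%}")
print(f"Manual-screened share: {manual_share:.1%}")
```

Under this assumption the AI-screened share comes to roughly 67%, which matches the reported reduction, and the manually screened share is well above the 10% minimum training set described in the methods.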

CONCLUSIONS: AI was found to be an efficient tool for Ti-Ab screening, especially for large reference sets (>5,000). AI simulation tools are useful in prioritizing likely inclusions and exclusions; however, additional quality checks are required to meet the rigorous requirements of Health Technology Assessment. Further research is needed to develop recommendations for the optimal integration of AI into literature reviews.

Conference/Value in Health Info

2023-11, ISPOR Europe 2023, Copenhagen, Denmark

Value in Health, Volume 26, Issue 11, S2 (December 2023)




Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics


No Additional Disease & Conditions/Specialized Treatment Areas
