AN APPLICATION OF ARTIFICIAL INTELLIGENCE-BASED METHODOLOGY IN LITERATURE REVIEWS
Author(s)
Wu EQ¹, Ayyagari R¹, Royer J², Li J¹, Wang J¹, Lefebvre P³, Patterson-Lomba O¹
¹Analysis Group, Inc., Boston, MA, USA, ²Analysis Group, Inc., Montreal, QC, Canada, ³Analysis Group, Inc., Montréal, QC, Canada
OBJECTIVES: Medical literature reviews are essential to evidence-based practice but are resource-intensive and often lack transparency and reproducibility. We examined the use of machine learning (ML) and natural language processing (NLP) to improve targeted and systematic literature reviews.
METHODS: For the targeted review, one reviewer screened 20% of the abstracts to create a training data set for the ML model. The trained model assigned relevance scores to the remaining abstracts, and the reviewer screened all abstracts with scores above a relevance cut-off. For the systematic review, two reviewers screened 8% of the identified abstracts to form the training sample, with a third reviewer resolving disagreements. The ML algorithm and one reviewer then screened the remaining articles separately, with a third reviewer resolving disagreements between the algorithm and the human reviewer. Model performance was assessed in terms of accuracy and overall time savings.
RESULTS: For the targeted review, 2,180 abstracts were identified; 450 were screened by one reviewer and used to train the AI algorithms. The remaining 1,730 abstracts were ranked by likelihood of relevance, and 240 were selected using an optimal threshold. Of these 240 abstracts, the reviewer identified 130 as relevant. AI-assisted screening required 35 hours vs. an estimated 65 hours with the traditional approach. For the systematic review, 3,767 abstracts were identified; 300 were randomly selected and screened by two reviewers to train the AI algorithms. An additional 364 abstracts were randomly selected and screened separately by the algorithms and one reviewer, and the 34 discrepancies between the reviewer and the algorithms were reconciled. AI-assisted screening required 135 hours vs. an estimated 200 hours with the traditional approach.
CONCLUSIONS: Properly trained ML and NLP algorithms can provide accurate abstract screening and significant efficiency gains while improving transparency and reproducibility.
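The proportions implied by the reported counts and hours can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the RESULTS section; the helper `pct` and the dictionary keys are illustrative names, not part of the study, and the derived percentages are simple ratios rather than performance metrics reported by the authors.

```python
def pct(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Targeted review: counts and hours as reported in the abstract.
targeted_summary = {
    "training_share_pct": pct(450, 2180),          # share screened to train the model
    "remaining": 2180 - 450,                       # abstracts left for the model to score
    "flagged_share_pct": pct(240, 1730),           # share of remaining above the cut-off
    "relevant_among_flagged_pct": pct(130, 240),   # reviewer-confirmed yield among flagged
    "time_saved_pct": pct(65 - 35, 65),            # vs. estimated traditional approach
}

# Systematic review: counts and hours as reported in the abstract.
systematic_summary = {
    "training_share_pct": pct(300, 3767),          # share dual-screened for training
    "discrepancy_pct": pct(34, 364),               # reviewer vs. algorithm disagreements
    "time_saved_pct": pct(200 - 135, 200),         # vs. estimated traditional approach
}

for name, summary in (("targeted", targeted_summary), ("systematic", systematic_summary)):
    for key, value in summary.items():
        print(f"{name:10s} {key:28s} {value}")
```

On these figures, the estimated time saving is about 46.2% for the targeted review and 32.5% for the systematic review. The 54.2% yield among flagged abstracts is not a sensitivity estimate: the abstract does not report how many relevant abstracts scored below the cut-off, so recall cannot be derived from the numbers given.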
Conference/Value in Health Info
2018-11, ISPOR Europe 2018, Barcelona, Spain
Value in Health, Vol. 21, S3 (October 2018)
Code
PRM81
Topic
Real World Data & Information Systems
Topic Subcategory
Reproducibility & Replicability
Disease
Geriatrics, Pediatrics, Reproductive and Sexual Health