AN APPLICATION OF ARTIFICIAL INTELLIGENCE-BASED METHODOLOGY IN LITERATURE REVIEWS

Author(s)

Wu EQ¹, Ayyagari R¹, Royer J², Li J¹, Wang J¹, Lefebvre P³, Patterson-Lomba O¹
¹Analysis Group, Inc., Boston, MA, USA, ²Analysis Group, Inc., Montreal, QC, Canada, ³Analysis Group, Inc., Montréal, QC, Canada

OBJECTIVES: Medical literature reviews are essential to evidence-based practice but are resource-intensive and often lack transparency and reproducibility. We examined the use of machine learning (ML) and natural language processing (NLP) to improve targeted and systematic literature reviews.
METHODS: For the targeted review, a reviewer screened 20% of the abstracts to serve as a training data set for the ML model. The trained model assigned relevance scores to the remaining abstracts, and the reviewer screened all abstracts with scores above a relevance cut-off. For the systematic review, two reviewers screened 8% of the identified abstracts, with a third reviewer resolving disagreements, to build the training sample. The ML algorithm and one reviewer then screened the remaining articles separately, with a third reviewer resolving disagreements between the algorithm and the human reviewer. Model performance was assessed in terms of accuracy and overall time savings.
RESULTS: For the targeted review, 2,180 abstracts were identified; 450 were reviewed by one reviewer and used to train the AI algorithms. The remaining 1,730 abstracts were ranked by likelihood of relevance, and 240 abstracts were selected using an optimal threshold. Of these 240 abstracts, 130 were identified as relevant by the reviewer. AI-assisted screening required 35 hours vs. an estimated 65 hours with the traditional approach. For the systematic review, 3,767 abstracts were identified; 300 were randomly selected and reviewed by two reviewers to train the AI algorithms. An additional 364 abstracts were randomly selected and reviewed separately by the algorithms and one reviewer, and the 34 discrepancies between the reviewer and the algorithms were reconciled. AI-assisted screening required 135 hours vs. an estimated 200 hours with the traditional approach.
CONCLUSIONS: Properly trained ML and NLP algorithms can provide accurate abstract screening and significant efficiency gains while improving transparency and reproducibility.
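
ILLUSTRATIVE SKETCH: The abstract does not name the model, features, or cut-off used, so the following is only a minimal sketch of the targeted-review workflow described in METHODS (train on a reviewer-labelled sample, score the remaining abstracts, pass those above a cut-off back to the reviewer). It assumes a multinomial naive Bayes scorer as a stand-in for the authors' ML/NLP algorithms; the function names, toy abstracts, and the 0.5 cut-off are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude whitespace tokenizer; a real NLP pipeline would do far more.
    return text.lower().split()


def train(labelled):
    """Fit word counts from (abstract_text, is_relevant) pairs.

    Assumes the labelled sample contains at least one relevant and one
    irrelevant abstract.
    """
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, label in labelled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def relevance_score(model, text):
    """Probability of relevance under naive Bayes with add-one smoothing."""
    counts, docs, vocab = model
    log_odds = math.log(docs[True] / docs[False])
    total_rel = sum(counts[True].values()) + len(vocab)
    total_irr = sum(counts[False].values()) + len(vocab)
    for tok in tokenize(text):
        if tok in vocab:  # words unseen in training are ignored
            log_odds += math.log((counts[True][tok] + 1) / total_rel)
            log_odds -= math.log((counts[False][tok] + 1) / total_irr)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical reviewer-labelled training sample (the "20%" step).
labelled = [
    ("randomized trial of drug efficacy in adults", True),
    ("cohort study of treatment efficacy outcomes", True),
    ("editorial on hospital parking policy", False),
    ("letter about conference travel policy", False),
]
# Hypothetical remaining abstracts to be scored by the model.
remaining = [
    "trial of treatment efficacy in children",
    "new parking policy letter",
]

CUTOFF = 0.5  # assumed value; the abstract reports only "an optimal threshold"

model = train(labelled)
# Abstracts scoring above the cut-off go back to the human reviewer.
flagged = [a for a in remaining if relevance_score(model, a) > CUTOFF]
```

In this toy run only the first of the two remaining abstracts clears the cut-off and would be sent to the reviewer. In practice the cut-off would be tuned on held-out labelled data to balance missed relevant abstracts against reviewer workload.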

Conference/Value in Health Info

2018-11, ISPOR Europe 2018, Barcelona, Spain

Value in Health, Vol. 21, S3 (October 2018)

Code

PRM81

Topic

Real World Data & Information Systems

Topic Subcategory

Reproducibility & Replicability

Disease

Geriatrics, Pediatrics, Reproductive and Sexual Health
