Automation of Title and Abstract Screening: Can Robots Replace Humans?

Author(s)

Abogunrin S¹, Queiros L², Witzmann A³, Sumner M⁴, Wehler P⁴, Baehrens D⁴
¹F. Hoffmann-La Roche Ltd., Basel, Switzerland, ²F. Hoffmann-La Roche, Basel, Switzerland, ³F. Hoffmann-La Roche, Kaiseraugst, AG, Switzerland, ⁴Averbis GmbH, Freiburg, Germany

Background: Support vector machines (SVMs) are an artificial intelligence method that has previously been used to automate the title and abstract screening (TIABS) step of systematic literature reviews (SLRs). Nevertheless, questions remain about the reproducibility of this method across different types of SLRs. We assessed the use of SVMs for automated TIABS in various therapeutic areas and review types.

Methods: Ten previously completed, human-performed SLRs spanning various therapeutic areas were identified. For every SLR, multiple SVMs were trained independently and then used to assign each record an include or exclude status, an exclusion reason, and a confidence estimate (range: 0–1). To ensure that relevant records were not missed, records with a confidence estimate of <0.8 were included by default. The automatic classifications were compared with the human classifications using confusion matrices, precision, recall, and F1 score.
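The default-include rule and the evaluation metrics described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the `Prediction` class, the function names, and the toy records are hypothetical, and the sketch assumes that any prediction with confidence below 0.8 is treated as an include regardless of the label the classifier proposed.

```python
from __future__ import annotations

from dataclasses import dataclass

THRESHOLD = 0.8  # confidence cut-off stated in the abstract


@dataclass
class Prediction:
    """Hypothetical container for one classifier output."""
    label: str         # "include" or "exclude", as proposed by the classifier
    confidence: float  # confidence estimate in [0, 1]


def final_decision(pred: Prediction, threshold: float = THRESHOLD) -> str:
    """Include by default when confidence is below the threshold (assumed reading)."""
    if pred.confidence < threshold:
        return "include"
    return pred.label


def screening_metrics(human: list[str], machine: list[str]) -> dict[str, float]:
    """Precision, recall and F1 for the 'include' class, using human decisions as reference."""
    pairs = list(zip(human, machine))
    tp = sum(h == "include" and m == "include" for h, m in pairs)
    fp = sum(h == "exclude" and m == "include" for h, m in pairs)
    fn = sum(h == "include" and m == "exclude" for h, m in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data, invented for illustration only.
human = ["include", "exclude", "exclude", "exclude", "include"]
preds = [
    Prediction("include", 0.95),
    Prediction("exclude", 0.60),  # low confidence, so included by default
    Prediction("exclude", 0.99),
    Prediction("include", 0.85),
    Prediction("exclude", 0.70),  # low confidence, so included by default
]
machine = [final_decision(p) for p in preds]
metrics = screening_metrics(human, machine)
print(machine, metrics)
```

In this toy example the default-include rule rescues the relevant record that the classifier had tentatively excluded, at the cost of one extra false positive, which is the trade-off between recall and precision that the rule is designed to make.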

Results: The research questions covered clinical and economic SLRs in oncology, infectious diseases, and haematology. Dataset sizes ranged from 319 to 16,962 records. For the include versus exclude classification, recall ranged from 0.90 to 1.00, precision from 0.02 to 0.37, and F1 score from 0.05 to 0.53. When exclusion reasons were considered, recall ranged from 0.00 to 0.97, precision from 0.00 to 0.96, and F1 score from 0.00 to 0.97.
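For reference, the standard definitions of the three metrics are given below, where TP, FP, and FN denote true positives, false positives, and false negatives for the include class. The worked line uses illustrative values (precision 0.10, recall 0.95) that are not taken from any individual SLR in this study; it only shows why F1 stays low when recall is high but precision is low.

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\]
\[
F_1 = \frac{2 \times 0.10 \times 0.95}{0.10 + 0.95} = \frac{0.19}{1.05} \approx 0.18
\]
```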

Conclusions: The analysis consistently found high recall for all investigated SLR questions: few or no relevant records were missed, but precision was low because of the number of false positives. As such, SVMs alone may not be sufficient for TIABS automation and should be investigated in combination with other artificial intelligence methods with text mining capabilities, to reduce the time needed to screen titles and abstracts when conducting an SLR.

Conference/Value in Health Info

2021-11, ISPOR Europe 2021, Copenhagen, Denmark

Value in Health, Volume 24, Issue 12, S2 (December 2021)

Code

POSA318

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Specific Disease
