Automation of Title and Abstract Screening: Can Robots Replace Humans?
Abogunrin S1, Queiros L2, Witzmann A3, Sumner M4, Wehler P4, Baehrens D4
1F. Hoffmann-La Roche Ltd., Basel, Switzerland, 2F. Hoffmann-La Roche, Basel, Switzerland, 3F. Hoffmann-La Roche, Kaiseraugst, AG, Switzerland, 4Averbis GmbH, Freiburg, Germany
Background: Support vector machines (SVMs) are an artificial intelligence method that has previously been used to automate the title and abstract screening (TIABS) step of systematic literature reviews (SLRs). Nevertheless, questions remain about the reproducibility of this method across different types of SLRs. We assessed the use of SVMs for automated TIABS in various therapeutic areas and review types.
Methods: Ten previously completed human-performed SLRs spanning various therapeutic areas were identified. For each SLR, multiple SVMs were independently trained and then used to assign an include or exclude status, plus an exclusion reason, to each record, along with a confidence estimate (range: 0–1). To ensure relevant records were not missed, records with a confidence estimate of <0.8 were included by default. The automatic classifications were compared with the human classifications using confusion matrices, precision, recall, and F1 score.
Results: The research questions included clinical and economic SLRs in oncology, infectious diseases and haematology. The dataset sample sizes varied between 319 and 16,962 records. For include versus exclude classification, recall ranged from 0.90 to 1.00, precision from 0.02 to 0.37, and F1 score from 0.05 to 0.53. When exclusion reasons were considered, the corresponding ranges were 0.00 to 0.97, 0.00 to 0.96, and 0.00 to 0.97.
Conclusions: The analysis consistently found high recall for all investigated SLR questions, resulting in many false positives (low precision) but with few or no relevant records being missed. As such, SVMs alone may not be sufficient for TIABS automation and should be investigated in combination with other artificial intelligence methods with text mining capabilities, to optimise the time needed for the screening of title and abstract records when conducting an SLR.
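The include-by-default rule and the evaluation metrics described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Prediction`, `final_decision`, and `screening_metrics` are hypothetical, the toy records are invented for illustration only, and the sketch assumes the confidence estimate refers to the classifier's predicted label, which the abstract does not specify.

```python
from dataclasses import dataclass

# Threshold reported in the abstract: records below it are included by default.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    """Hypothetical container for one classifier output."""
    label: str         # "include" or "exclude"
    confidence: float  # assumed to lie in [0, 1]


def final_decision(pred: Prediction, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Apply the include-by-default rule to a single prediction."""
    if pred.confidence < threshold:
        return "include"  # low confidence: keep the record for human review
    return pred.label


def screening_metrics(human: list[str], machine: list[str]) -> dict[str, float]:
    """Confusion-matrix counts plus precision, recall, and F1 for 'include'."""
    pairs = list(zip(human, machine))
    tp = sum(h == "include" and m == "include" for h, m in pairs)
    fp = sum(h == "exclude" and m == "include" for h, m in pairs)
    fn = sum(h == "include" and m == "exclude" for h, m in pairs)
    tn = sum(h == "exclude" and m == "exclude" for h, m in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


# Invented toy data, not taken from the study.
human = ["include", "exclude", "exclude", "exclude", "include", "exclude"]
predictions = [
    Prediction("include", 0.95),
    Prediction("exclude", 0.60),  # below threshold, so included by default
    Prediction("exclude", 0.92),
    Prediction("include", 0.85),
    Prediction("exclude", 0.55),  # below threshold, so included by default
    Prediction("exclude", 0.99),
]
machine = [final_decision(p) for p in predictions]
metrics = screening_metrics(human, machine)
print(metrics)
```

In this toy example the default-include rule rescues a relevant record that the classifier would have excluded, raising recall while adding a false positive. This is the same trade-off between high recall and low precision that the Results report.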
Conference/Value in Health Info
2021-11, ISPOR Europe 2021, Copenhagen, Denmark
Value in Health, Volume 24, Issue 12, S2 (December 2021)
Methodological & Statistical Research
Artificial Intelligence, Machine Learning, Predictive Analytics
No Specific Disease