Natural Language Processing to Support the Abstract Selection During Systematic Literature Review

Speaker(s)

Damentko M¹, Hasni MB², Abdellaoui MB², Barbier S³, Nikodem M⁴
¹Putnam, Kraków, MA, Poland, ²Putnam, Tunis, Tunisia, ³Putnam, Lyon, France, ⁴Putnam, Kraków, Poland

OBJECTIVES: Literature selection is a very time-consuming process and will only become more burdensome as the number of scientific publications steadily increases. Several Natural Language Processing (NLP) algorithms have been proposed for automatic reading and could be applied in this context. This research aims to apply an NLP-based algorithm to a validated historical systematic literature review (SLR) to perform abstract selection.

METHODS: We selected an NLP algorithm based on the SciBERT model, which is pretrained on a corpus of 1.14M papers from Semantic Scholar. The BERT architecture is a multilayer bidirectional transformer. To balance the input, we under-sampled the majority class, which mitigates bias towards it. We then fine-tuned the model on a custom dataset. The algorithm was implemented in PyTorch.
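
As an illustration of the under-sampling step, the sketch below randomly drops majority-class records until both classes are the same size. It is not the authors' code: the function name `undersample`, the `included` label key, the 1:1 target ratio, and the toy data are all assumptions made for the example, and only the Python standard library is used.

```python
import random


def undersample(records, label_key="included", seed=0):
    """Randomly drop majority-class records until both classes are equal in size.

    `records` is a list of dicts; `label_key` (hypothetical name) holds a
    boolean screening decision. The 1:1 ratio is an assumption.
    """
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]

    # Identify which class is smaller; the larger one gets sampled down.
    minority, majority = sorted([positives, negatives], key=len)
    balanced = minority + rng.sample(majority, len(minority))

    # Shuffle so that the classes are not grouped together during training.
    rng.shuffle(balanced)
    return balanced


# Toy data (not from the study): 100 abstracts, 10 of them included.
abstracts = [{"text": f"abstract {i}", "included": i % 10 == 0} for i in range(100)]
balanced = undersample(abstracts)
n_included = sum(r["included"] for r in balanced)
print(len(balanced), n_included)
```

The balanced list would then be the input to fine-tuning; under-sampling discards data, so other ratios or balancing methods may suit small datasets better.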

RESULTS: To evaluate the model, we used an SLR on economic models in chronic kidney disease comprising 15,059 abstracts, each screened by two independent reviewers. In the base scenario, 50% of the abstracts were used to train the algorithm and 50% to test it, with the reviewers’ consensus taken as the gold standard. The sensitivity and specificity of the algorithm were 0.90 and 0.87, respectively. When only one of the two reviewers was replaced by the algorithm, the sensitivity and specificity of the selection process were 0.97 and 0.99, respectively. With this approach, the human reviewer could avoid reading over 6800 abstracts without compromising quality, even after accounting for the additional review step needed to resolve discrepancies between the human reviewer and the model. This saves over 17 person-days, assuming a screening rate of 400 abstracts per day. Several other settings and SLR datasets were also tested, with promising results, especially for large datasets.
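
For readers who want to reproduce the metrics, the sketch below shows how sensitivity and specificity are computed against a gold standard, and how the person-day figure follows from the reported numbers. The function name and the five toy decisions are assumptions for illustration only; the 6800 abstracts and 400 abstracts per day are taken from the text above.

```python
def sensitivity_specificity(gold, predicted):
    """Return (sensitivity, specificity) for boolean include/exclude decisions."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # correctly included
    fn = sum(1 for g, p in pairs if g and not p)      # wrongly excluded
    tn = sum(1 for g, p in pairs if not g and not p)  # correctly excluded
    fp = sum(1 for g, p in pairs if not g and p)      # wrongly included
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both classes")
    return tp / (tp + fn), tn / (tn + fp)


# Toy decisions (not study data): True means "include the abstract".
gold = [True, True, False, False, False]
predicted = [True, False, False, False, True]
sens, spec = sensitivity_specificity(gold, predicted)
print(sens, spec)

# Workload arithmetic using the figures reported in the abstract.
abstracts_avoided = 6800
abstracts_per_day = 400
person_days_saved = abstracts_avoided / abstracts_per_day
print(person_days_saved)
```

Because the abstract reports "over 6800" abstracts avoided, the 17 person-days obtained here is a lower bound, consistent with the "over 17" stated above.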

CONCLUSIONS: Including NLP algorithms in the abstract selection process allows for considerable acceleration while maintaining high quality standards. These algorithms could be further improved by mapping the PICOs and by additional fine-tuning.

Code

MSR66

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas

ISPOR Europe 2023

12 - 15 November