Advancing Systematic Literature Reviews: The Integration of AI-Powered NLP Models in Data Collection Processes

Speaker(s)

Rai P¹, Kaur R¹, Pandey S¹, Attri S¹, Kaur G¹, Singh B²
¹Pharmacoevidence, Mohali, India, ²Pharmacoevidence, SAS Nagar Mohali, PB, India

Presentation Documents

ISPOR24_MSR59_Pandey_Poster140176.pdf

OBJECTIVES: Incorporating natural language processing (NLP) models to automate systematic literature reviews (SLRs) marks a notable advance in research methodology. These models use artificial intelligence (AI) to process large volumes of literature efficiently, streamlining the screening otherwise carried out by human reviewers. The aim of this work is to assess the efficiency of four NLP models (BERT, DistilBERT, ALBERTa, and XLNet) in screening publications based on semantic analysis of their titles and abstracts.

METHODS: Python-based NLP models were developed to enhance the efficiency of screening literature for SLRs. A domain expert with over a decade of experience manually screened the titles and abstracts of the data sample for model training and improvement. To capture contextual relationships within the data, texts were tokenized using the tokenizer specific to each model. To address class imbalance, random oversampling was employed to balance the dataset. The training set was subdivided through K-fold cross-validation to make the evaluation more robust, and model performance was then validated on the remaining, previously unseen data.
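
The oversampling and cross-validation steps can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the helper names `k_fold_indices` and `random_oversample`, the toy labels, and the choice to oversample only the training portion of each fold (a common way to keep duplicated records out of the validation fold) are all assumptions made for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k roughly equal, shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train_idx, val_idx


def random_oversample(indices, labels, seed=0):
    """Duplicate randomly chosen indices of smaller classes until every
    class present in `indices` is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy labels: 1 = include, 0 = exclude (not the study's data).
labels = [1] * 4 + [0] * 16

for fold_no, (train_idx, val_idx) in enumerate(k_fold_indices(len(labels), k=4)):
    # Oversample the training indices only; the validation fold is untouched.
    balanced_idx = random_oversample(train_idx, labels, seed=fold_no)
    counts = Counter(labels[i] for i in balanced_idx)
    print(fold_no, dict(counts), "validation size:", len(val_idx))
```

In practice the balanced index list for each fold would select the tokenized records passed to model fine-tuning, while the untouched validation indices would be used for scoring.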

RESULTS: Among the NLP models considered for title- and abstract-based screening, BERT performed best in the validation phase, attaining an accuracy of 90.05% and a sensitivity of 84.16%, the highest values for both metrics. DistilBERT followed on accuracy (88.90%) but had the lowest sensitivity (75.25%). XLNet achieved an accuracy of 87.24% and a sensitivity of 81.19%. ALBERTa had the lowest accuracy (78.34%), although its sensitivity (82.35%) was second only to that of BERT.
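
For reference, the two reported metrics follow the standard definitions: accuracy is the share of all records classified correctly, and sensitivity is the share of truly relevant records that the model flags for inclusion. The sketch below computes both from hypothetical labels; the function name `screening_metrics` and the example data are illustrative assumptions and do not reproduce the study's results.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Return accuracy and sensitivity (recall for the 'include' class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Sensitivity is undefined when there are no true positives to find.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"accuracy": accuracy, "sensitivity": sensitivity}


# Hypothetical screening decisions: 1 = include, 0 = exclude.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print(screening_metrics(y_true, y_pred))
```

Because relevant records are usually a small minority in SLR screening, a model can score high accuracy while missing many includes, which is why sensitivity is reported alongside accuracy.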

CONCLUSIONS: The NLP-driven strategy for automating SLR screening proved effective, with BERT exhibiting the highest accuracy among the models studied. Such automation reduces manual workload and improves the efficiency of the SLR process.

Code

MSR59

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis, Meta-Analysis & Indirect Comparisons

Disease

No Additional Disease & Conditions/Specialized Treatment Areas

ISPOR 2024

May 5-8, 2024