Assessing the Feasibility of Applying Natural Language Processing for Systematic Literature Reviews: A Case Study in Non-Small-Cell Lung Cancer

Speaker(s)

Rath N1, Walne H1, Louvet E1, Dunlop W1, Liljas B2
1AstraZeneca Pharmaceuticals LP, Cambridge, CAM, Great Britain, 2AstraZeneca, Gaithersburg, MD, USA

OBJECTIVES: Many health technology assessment (HTA) organizations require systematic literature reviews (SLRs) as part of submissions. However, SLRs are increasingly challenging to perform as the literature grows. The objective of this study was to assess the feasibility of applying natural language processing (NLP) to SLRs by comparing this approach to a previous “fully human” SLR in non-small-cell lung cancer (NSCLC).

METHODS: A search in Medline was conducted in February 2023 utilizing NLP through the i2e IQVIA NLP text mining application. Search strings, inclusion/exclusion criteria and date restrictions were taken from a “fully human” SLR conducted in August 2020 for Stage III NSCLC in accordance with key HTA guidelines. The study assessed the extent to which the NLP-based search could replicate the original “fully human” search and reduce the human effort required. The sensitivity of the NLP-based search was adjusted until it identified all the studies in the original SLR.
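The sensitivity check described above can be illustrated with a minimal sketch. It assumes each record carries a unique identifier and that the studies included in the original SLR serve as the reference set. The function names and the identifiers are hypothetical and do not come from the study or from the i2e application.

```python
def missed_studies(reference_ids, retrieved_ids):
    """Return reference studies that the NLP-based search did not retrieve."""
    return set(reference_ids) - set(retrieved_ids)


def sensitivity(reference_ids, retrieved_ids):
    """Fraction of reference studies that the NLP-based search retrieved."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(reference & set(retrieved_ids)) / len(reference)


# Hypothetical identifiers, for illustration only.
reference = {"ref-001", "ref-002", "ref-003"}
retrieved = {"ref-001", "ref-003", "other-104", "other-377"}

missing = missed_studies(reference, retrieved)
print(sorted(missing))
print(f"sensitivity = {sensitivity(reference, retrieved):.2f}")
```

Under this reading, a query would be loosened and rerun until `missed_studies` returns an empty set, which corresponds to a sensitivity of 1.0 against the original SLR.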

RESULTS: The Medline literature search was reproduced using i2e NLP, applying equivalent search terms and ontologies. Applying NLP allowed more specific queries, reducing the number of irrelevant documents. This reduced the number of documents for manual screening by 87% (from 4,736 to 617), while still identifying all 38 unique studies identified in the original “fully human” Medline search.
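The reported reduction can be checked from the two document counts given in the results. The sketch below is only an arithmetic illustration, and the function name is an assumption rather than part of the study's tooling.

```python
def screening_reduction(before, after):
    """Relative reduction in the number of documents needing manual screening."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Counts as reported in the results: 4,736 documents before NLP, 617 after.
reduction = screening_reduction(4736, 617)
print(f"{reduction:.1%}")  # prints 87.0%
```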

CONCLUSIONS: This study supports the use of NLP to substantially reduce SLR screening turnaround time and human workload. With current technology, SLRs are still expected to require human supervision, particularly for the remaining manual screening of abstracts/titles after applying NLP, analysis, and report writing. Data extraction is another area where NLP, or artificial intelligence (AI) more generally, could assist the SLR process. Given these developments, HTA organizations should consider reviewing their policies regarding the acceptance of AI for SLRs.

Code

MSR112

Topic

Health Technology Assessment, Methodological & Statistical Research, Organizational Practices, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Best Research Practices, Literature Review & Synthesis, Value Frameworks & Dossier Format

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology