Can Generative Artificial Intelligence (GenAI) Be Leveraged to Automate Scoping Searches of Systematic Literature Reviews (SLRs)?
Author(s)
Anna Maria Mavrigiannaki, MSc1, Theodora Oikonomidi, PhD1, Nimran Kaur, PhD2.
1IQVIA, Athens, Greece, 2IQVIA, Bangalore, India.
OBJECTIVES: Scoping searches are essential in SLR planning because they provide the expected number of search results, which determines the expected effort. Traditionally, scoping searches have been developed by subject matter experts (SMEs). This study evaluated the performance of Microsoft Copilot® in assisting human reviewers with the development of SLR scoping searches.
METHODS: Searches previously developed by SMEs for three economic SLRs were compared with scoping searches developed by Copilot® using a prompt devised by an SME. Copilot® was instructed to develop a search for Embase® via OVID SP®, based on three inputs: 1) the research question, 2) the population, intervention, comparison, outcome, and study design (PICOS) criteria, and 3) two studies that the search should retrieve (test set). The search developed by Copilot® was run and the results were reviewed. The authors evaluated: 1) whether the Copilot®-developed search code ran on OVID SP®, 2) the total number of hits returned by the Copilot® search versus the original SLR search, and 3) the proportion of articles included in the completed SLRs that the Copilot® search retrieved.
RESULTS: In all three cases, the Copilot®-generated search code required SME corrections before it would run, because it contained invalid subject headings, syntax errors, and unsupported characters (e.g., ö). In two cases, the Copilot® searches overestimated the number of hits (by 388-614 hits, i.e., 34%-45% more than the original SLR search). However, these Copilot® searches retrieved 90% and 91% of the reports included in the original SLRs, respectively. In the third case, the Copilot® search underestimated the number of hits (by 244, 15%) and retrieved only 23% of the reports included in the original SLR.
CONCLUSIONS: Copilot® performance at estimating search hits was modest, but in two of three cases the Copilot® search identified at least 90% of the relevant studies. Copilot® could assist junior reviewers with search development, particularly for targeted literature reviews, but SME revision remains indispensable.
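ILLUSTRATIVE SKETCH: The two quantitative measures described in METHODS (difference in total hits versus the original SLR search, and the proportion of included reports retrieved) could be computed along the following lines. This is a minimal sketch and not the authors' code: the function name, the record identifiers, and all numbers in the usage example are hypothetical placeholders, since the abstract does not report the underlying counts.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ScopingComparison:
    hit_difference: int        # generated-search hits minus original hits
    hit_difference_pct: float  # difference as a percentage of original hits
    recall: float              # share of included reports that were retrieved


def compare_searches(
    original_hits: int,
    generated_hits: int,
    included_ids: Iterable[str],
    generated_retrieved_ids: Iterable[str],
) -> ScopingComparison:
    """Compare a generated scoping search against the original SLR search.

    A positive hit_difference means the generated search overestimated
    the number of hits; a negative value means it underestimated them.
    """
    if original_hits <= 0:
        raise ValueError("original_hits must be positive")
    included = set(included_ids)
    if not included:
        raise ValueError("included_ids must not be empty")

    difference = generated_hits - original_hits
    difference_pct = 100.0 * difference / original_hits
    # Recall: included reports that also appear in the generated search results.
    recall = len(included & set(generated_retrieved_ids)) / len(included)
    return ScopingComparison(difference, difference_pct, recall)


# Hypothetical example values, chosen only to illustrate the calculation.
included = [f"rec{i}" for i in range(1, 11)]                 # 10 included reports
retrieved = [f"rec{i}" for i in range(1, 10)] + ["other1"]   # 9 of them retrieved
result = compare_searches(1000, 1340, included, retrieved)
print(result)
```

With these placeholder inputs the generated search returns 340 more hits than the original and retrieves 9 of the 10 included reports. Holding the included reports as a set keeps the recall calculation independent of how many irrelevant records the generated search also returns.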
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR53
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas