An Evaluation of PhlexNeuron, an Internal, Proprietary Artificial Intelligence (AI) Tool for Systematic Literature Review (SLR) Screening

Author(s)

Nicole Szydlowski, PharmD1, Brittany Galloway, PharmD2, Malia Gill, MS2, Derek Swiger, MS, PharmD2, Daniel Koppers, .2, Suresh Shankar, MBA2, Kimberly M. Ruiz, EdM2, Nicole Fusco, ScD2.
1Medical Communications Research Fellow, Cencora, Inc., Conshohocken, PA, USA, 2Cencora, Inc., Conshohocken, PA, USA.
OBJECTIVES: Systematic literature reviews (SLRs) are an essential tool for evidence-based decision-making. However, their rigorous methodology requires a substantial investment of time and cost. Several tools are available to perform literature screening assisted by artificial intelligence (AI); these tools typically require a training set of example references for each new SLR. The aim of this research is to assess the performance of a proprietary, internal AI tool for literature screening that does not require a training set and is therefore likely to save time.
METHODS: Title/abstract screening was previously completed in 4 SLRs by human reviewers. The SLRs evaluated clinical, costs and health-care resource utilization (HCRU), economic evaluation, and humanistic outcomes. Eligibility questions were generated using the population, intervention, comparator, outcome, and study design (PICOS) criteria from the original SLRs. The AI tool was prompted to answer the PICOS questions and to provide a screening recommendation (include, exclude, or uncertain). Analyses were conducted to compare human and AI results, including sensitivity and specificity.
RESULTS: The clinical, costs/HCRU, economic evaluation, and humanistic datasets included 4233, 1908, 1289, and 2637 references, respectively. When the AI screening recommendations were compared to human reviewers, the sensitivity and specificity estimates were 91%/65%, 89%/50%, 93%/45%, and 89%/71% for the clinical, costs/HCRU, economic evaluation, and humanistic SLRs, respectively. The AI tool also provided an explanation for its response to each PICOS question.
CONCLUSIONS: The AI literature screening tool identified the majority of relevant articles, with sensitivity estimates of at least 89% and specificity estimates of at least 45% across the 4 SLRs. Therefore, AI-assisted screening using prompts based on the PICOS framework is feasible, and the explanations accompanying the AI responses to each PICOS question can increase transparency. These study results can be used to inform future refinement of AI tools that do not require training sets for SLR screening processes.
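
The sensitivity and specificity comparison described in the METHODS can be illustrated with a minimal sketch. It treats the human reviewers' decisions as the reference standard and counts agreement with the AI recommendations. The function name, the label strings, and in particular the choice to count an "uncertain" recommendation as an include are assumptions made for illustration; the abstract does not state how uncertain recommendations were handled or how the analysis was implemented.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    human: Iterable[str], ai: Iterable[str]
) -> Tuple[float, float]:
    """Compare AI screening recommendations against human decisions.

    human: "include" or "exclude" for each reference (reference standard).
    ai:    "include", "exclude", or "uncertain" for each reference.
    """
    tp = fn = tn = fp = 0
    for h, a in zip(human, ai):
        # Assumption: "uncertain" is counted as an include, so that the
        # reference would still be passed on to a human reviewer.
        ai_include = a in ("include", "uncertain")
        if h == "include":
            if ai_include:
                tp += 1  # relevant reference kept by the AI
            else:
                fn += 1  # relevant reference missed by the AI
        else:
            if ai_include:
                fp += 1  # irrelevant reference kept by the AI
            else:
                tn += 1  # irrelevant reference correctly excluded
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels for illustration only; these are not study data.
human_labels = ["include", "include", "exclude", "exclude", "exclude"]
ai_labels = ["include", "exclude", "exclude", "uncertain", "exclude"]
sens, spec = sensitivity_specificity(human_labels, ai_labels)
```

Under this convention, sensitivity is the share of human-included references that the AI also kept, and specificity is the share of human-excluded references that the AI also excluded. Counting "uncertain" as an exclude instead would lower sensitivity and raise specificity.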

Conference/Value in Health Info

2025-05, ISPOR 2025, Montréal, Quebec, CA

Value in Health, Volume 28, Issue S1

Code

MSR121

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

SDC: Sensory System Disorders (Ear, Eye, Dental, Skin)
