An Evaluation of the Rayyan Artificial Intelligence Tool for Systematic Literature Review Screening

Speaker(s)

Ng J1, Szydlowski N2, Gill M1, Fusco N1, Ruiz K1
1Cencora, Conshohocken, PA, USA, 2Cencora, Chicago, IL, USA

OBJECTIVES: Systematic literature reviews (SLRs), the gold standard for evidence assessment, are both time- and labor-intensive. Artificial intelligence (AI) is a promising technology for streamlining evidence synthesis processes. This research aims to evaluate the performance of the Rayyan AI tool (ie, Rayyan) for SLR title/abstract screening and to quantify any screening efficiencies.

METHODS: Two SLRs previously dual-screened by human reviewers were identified. The SLRs assessed clinical, humanistic, and economic outcomes in the following therapeutic areas: ophthalmology (ie, SLR 1) and oncology (ie, SLR 2). Rayyan was trained on 20% of the total references (ie, the training set) for both SLR 1 and SLR 2. Rayyan then predicted the relevance of the remaining references using a 5-level system ranging from “most likely to exclude” to “most likely to include”. Rayyan’s relevancy ratings were compared with the original SLR inclusion/exclusion decisions to calculate sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. Time savings were also calculated.
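The five performance measures named above follow from a 2x2 table that cross-classifies Rayyan's predicted decision against the original human decision. The following is a minimal Python sketch, assuming the human dual-screening decision serves as the reference standard. The function name `screening_metrics` and the counts in the usage line are hypothetical and are not study data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening performance measures from 2x2 counts.

    tp: references included by both the tool and the human reviewers
    fp: references included by the tool but excluded by the humans
    fn: references excluded by the tool but included by the humans
    tn: references excluded by both
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    total = tp + fp + fn + tn
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, total),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = screening_metrics(tp=45, fp=300, fn=5, tn=650)
```

With counts like these, sensitivity and NPV are high while PPV is low, a pattern that is typical when relevant references are a small fraction of the screened set.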

RESULTS: When references with Rayyan ratings of “most likely to include”, “likely to include”, and “no recommendation” were classified as included, sensitivity ranged from 79% to 100% across all outcomes, with the highest sensitivity reported for clinical and humanistic outcomes (88%-96% and 96%-100%, respectively). Specificity ranged from 8% to 62%; PPV, 3% to 44%; NPV, 90% to 100%; and accuracy, 10% to 64%. The greatest time savings were observed with the SLR 2 clinical outcome, where utilizing 1 AI-assisted reviewer and 1 human reviewer yielded a 50% decrease in hours spent on title/abstract screening for the AI-assisted reviewer.
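The inclusion rule behind these results can be sketched as a simple mapping from rating to predicted decision, followed by a tally against the human decisions. This is an illustrative Python sketch, not Rayyan's API: the names `INCLUDE_RATINGS`, `predicted_include`, and `tally` are hypothetical, ratings are assumed to be exported as plain text labels, and any rating other than the three named above is assumed to count as a predicted exclusion.

```python
# Ratings treated as a predicted "include", as stated in the results.
INCLUDE_RATINGS = frozenset({
    "most likely to include",
    "likely to include",
    "no recommendation",
})


def predicted_include(rating):
    """Return True if the rating label counts as a predicted inclusion."""
    return rating.strip().lower() in INCLUDE_RATINGS


def tally(ratings, human_included):
    """Count 2x2 cells from rating labels and human include/exclude decisions."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for rating, truth in zip(ratings, human_included):
        predicted = predicted_include(rating)
        if predicted and truth:
            counts["tp"] += 1
        elif predicted and not truth:
            counts["fp"] += 1
        elif not predicted and truth:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Hypothetical toy input for illustration only (not study data).
toy_counts = tally(
    ["most likely to include", "no recommendation",
     "most likely to exclude", "most likely to exclude"],
    [True, False, True, False],
)
```

Counting “no recommendation” as a predicted inclusion can only add inclusions, so this rule tends to favor sensitivity over specificity.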

CONCLUSIONS: AI-assisted screening using Rayyan showed high sensitivity and potential time savings. Despite these positive results, the low-to-moderate specificity values limit the time-saving benefits. Further research is necessary to investigate the most effective ways to incorporate AI-assisted processes into SLRs.

Code

MSR72

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas