An Evaluation of the Rayyan Artificial Intelligence Tool for Systematic Literature Review Screening

Speaker(s)

Ng J¹, Szydlowski N², Gill M¹, Fusco N¹, Ruiz K¹
¹Cencora, Conshohocken, PA, USA, ²Cencora, Chicago, IL, USA

Presentation Documents

2024 ISPOR evaluation of the Rayyan AI tool #2229_4.21.24_XF135370.pdf

OBJECTIVES: Systematic literature reviews (SLRs), the gold standard for evidence assessment, are both time- and labor-intensive. However, artificial intelligence (AI) is a promising technology for streamlining evidence synthesis processes. This research aims to evaluate the performance of the Rayyan AI tool (ie, Rayyan) for SLR title/abstract screening and quantify any screening efficiencies.

METHODS: Two SLRs previously dual-screened by human reviewers were identified. The SLRs assessed clinical, humanistic, and economic outcomes in the following therapeutic areas: ophthalmology (ie, SLR 1) and oncology (ie, SLR 2). Rayyan was trained on 20% of the total references (ie, training set) for both SLR 1 and SLR 2. Then, Rayyan predicted the relevance of the remaining references using a 5-level system ranging from “most likely to exclude” to “most likely to include”. Rayyan’s relevancy ratings were compared to the original SLR inclusion/exclusion decisions to calculate sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. Time savings were also calculated.
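The five performance measures named above follow from the standard confusion-matrix definitions, with the human dual-screening decision as the reference standard. The following is a minimal sketch of that arithmetic; the function name `screening_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute screening performance from confusion-matrix counts.

    tp: AI rated include, human reviewers included
    fp: AI rated include, human reviewers excluded
    tn: AI rated exclude, human reviewers excluded
    fn: AI rated exclude, human reviewers included
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Illustrative counts only (assumed for this sketch, not study data).
metrics = screening_metrics(tp=45, fp=300, tn=150, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With counts like these, sensitivity and NPV are high while specificity, PPV, and accuracy are low, which is the general pattern expected when few references are relevant and the tool errs toward inclusion.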

RESULTS: When references with Rayyan ratings of “most likely to include”, “likely to include”, and “no recommendation” were classified as included, sensitivity ranged from 79%-100% across all outcomes, with the highest sensitivity reported for clinical and humanistic outcomes (88%-96% and 96%-100%, respectively). Specificity ranged from 8%-62%; PPV, 3%-44%; NPV, 90%-100%; and accuracy, 10%-64%. The greatest time savings were observed for the SLR 2 clinical outcome, where using 1 AI-assisted reviewer and 1 human reviewer yielded a 50% decrease in the AI-assisted reviewer’s title/abstract screening hours.
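The inclusion threshold described above (treating the three ratings “most likely to include”, “likely to include”, and “no recommendation” as AI inclusions) can be sketched as a simple dichotomization followed by a tally against the human decisions. This is an illustrative sketch only: the names `INCLUDE_RATINGS` and `confusion_counts` and the example data are hypothetical, and any rating outside the three listed is assumed to count as an AI exclusion.

```python
# Ratings treated as an AI "include" decision, per the threshold in the abstract.
INCLUDE_RATINGS = {
    "most likely to include",
    "likely to include",
    "no recommendation",
}


def confusion_counts(ratings, human_included):
    """Tally (tp, fp, tn, fn) for AI ratings against human decisions.

    ratings: iterable of rating labels, one per reference
    human_included: iterable of booleans, True if the original SLR
        included the reference at title/abstract screening
    """
    tp = fp = tn = fn = 0
    for rating, included in zip(ratings, human_included):
        predicted_include = rating in INCLUDE_RATINGS
        if predicted_include and included:
            tp += 1
        elif predicted_include and not included:
            fp += 1
        elif not predicted_include and not included:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Made-up example of five references (not study data).
example_ratings = [
    "most likely to include",
    "no recommendation",
    "most likely to exclude",
    "likely to include",
    "most likely to exclude",
]
example_human = [True, False, False, True, True]
counts = confusion_counts(example_ratings, example_human)
print(counts)
```

Counting “no recommendation” as an inclusion favors sensitivity over specificity, since uncertain references are passed on to a human reviewer rather than excluded.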

CONCLUSIONS: AI-assisted screening using Rayyan has shown high sensitivity and potential time savings. Despite these positive results, low-to-moderate specificity values limit time-saving benefits. Further research is necessary to investigate the most effective ways to incorporate AI-assisted processes in SLRs.

Code

MSR72

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas

ISPOR 2024

May 5-8, 2024