Re-Training of the Artificial Intelligence (AI) Tool LiveSTART™ with Additional Datasets: Updated Accuracy in the Title and Abstract Review Stage of Systematic Literature Reviews (SLR)
Author(s)
Liu J1, Jafar R2, Girard LA3, Thorlund K4, Forsythe A4
1Cytel Inc., Toronto, ON, Canada, 2Cytel Inc., Vancouver, BC, Canada, 3Cytel Inc., Montreal, QC, Canada, 4Cytel, Waltham, MA, USA
OBJECTIVES: SLRs are labor-intensive and time-consuming, yet they are required for regulatory and health technology assessment (HTA) submissions. We previously reported LiveSTART™, a novel AI tool that uses transfer learning to perform the title and abstract (TiAb) review stage of the SLR process, with reported accuracy = 0.92, precision = 0.91, recall = 0.86, F1-score = 0.89, and AUC = 0.91. To increase confidence in LiveSTART™ predictions, we investigated whether re-training with additional datasets could improve its performance.
METHODS: LiveSTART™ uses deep learning (a 12-layer neural network) to identify text relevant to population, intervention/comparator, outcome, and study design (PICOS), and then hierarchically predicts publication acceptance based on the given inclusion/exclusion criteria. In addition to the 59 SLR datasets with 65,328 publications reported in November 2022, 42 more datasets with 18,371 publications were prepared for re-training the existing machine learning model. All new datasets were manually annotated by two independent reviewers, and discrepancies were resolved by a third, senior reviewer.
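One way to picture the hierarchical acceptance step described above is a rule that checks each PICOS element in turn and excludes a publication at the first element that fails. The following is only a minimal sketch under stated assumptions: the per-element relevance scores, the 0.5 threshold, the checking order, and the function name `screen` are all hypothetical and are not taken from LiveSTART™ itself.

```python
from typing import Dict, Optional, Tuple

# Assumed checking order; the abstract does not specify how the hierarchy is arranged.
PICOS_ORDER = ("population", "intervention_comparator", "outcome", "study_design")


def screen(scores: Dict[str, float], threshold: float = 0.5) -> Tuple[str, Optional[str]]:
    """Return ("include", None), or ("exclude", <first PICOS element that fails>).

    `scores` maps each PICOS element to a hypothetical relevance score in [0, 1],
    such as a classifier might emit for one title/abstract record.
    A missing element is treated as a score of 0.0 (an assumption).
    """
    for element in PICOS_ORDER:
        if scores.get(element, 0.0) < threshold:
            return "exclude", element
    return "include", None


if __name__ == "__main__":
    record = {
        "population": 0.94,
        "intervention_comparator": 0.81,
        "outcome": 0.22,  # below the threshold, so the record is excluded here
        "study_design": 0.90,
    }
    print(screen(record))
```

Returning the failing element alongside the decision mirrors how human reviewers typically record an exclusion reason at the TiAb stage.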
RESULTS: The 42 new datasets covered 10 oncology indications and comprised 22 economic and 20 health-related quality-of-life (HR-QOL) SLRs. In total, LiveSTART™ has now been trained on 101 datasets: 47 clinical, 28 economic, and 26 HR-QOL. Compared with the original results published in November 2022, an average increase of approximately 2% was observed across metrics (accuracy +0.01, precision +0.02, recall +0.02, F1-score +0.025, AUC +0.02).
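The arithmetic behind these figures can be checked with a short script. This sketch assumes the reported increments are absolute differences added to the November 2022 baseline values. The abstract does not state the updated metric values explicitly, so the `updated` numbers below are derived, not reported.

```python
from decimal import Decimal

# Baseline metrics as reported for the original model (November 2022).
baseline = {
    "accuracy": Decimal("0.92"),
    "precision": Decimal("0.91"),
    "recall": Decimal("0.86"),
    "f1": Decimal("0.89"),
    "auc": Decimal("0.91"),
}

# Reported increments after re-training (assumed to be absolute differences).
delta = {
    "accuracy": Decimal("0.01"),
    "precision": Decimal("0.02"),
    "recall": Decimal("0.02"),
    "f1": Decimal("0.025"),
    "auc": Decimal("0.02"),
}

# Derived (not reported) post-re-training values.
updated = {name: baseline[name] + delta[name] for name in baseline}

# Mean increment across the five metrics.
mean_delta = sum(delta.values()) / len(delta)

# Training-set totals from the counts given in the abstract.
total_datasets = 59 + 42
total_publications = 65_328 + 18_371

if __name__ == "__main__":
    for name, value in updated.items():
        print(f"{name}: {baseline[name]} -> {value}")
    print("mean increment:", mean_delta)
    print("datasets:", total_datasets, "publications:", total_publications)
```

Using `Decimal` avoids binary floating-point rounding when adding values such as 0.89 and 0.025.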
CONCLUSIONS: Machine learning algorithms rely on supervised training to improve performance, and training dataset size has been reported to correlate positively with performance. We expect better predictions with this 58% increase in training data. LiveSTART™ could potentially yield significant time savings; however, adoption by regulatory and HTA authorities will be required.
Conference/Value in Health Info
Value in Health, Volume 26, Issue 6, S2 (June 2023)
Code
MSR33
Topic
Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis
Disease
No Additional Disease & Conditions/Specialized Treatment Areas