Re-Training of the Artificial Intelligence (AI) Tool LiveSTART^TM with Additional Datasets: Updated Accuracy in the Title and Abstract Review Stage of Systematic Literature Reviews (SLR)

Author(s)

Liu J¹, Jafar R², Girard LA³, Thorlund K⁴, Forsythe A⁴
¹Cytel Inc., Toronto, ON, Canada, ²Cytel Inc., Vancouver, BC, Canada, ³Cytel Inc., Montreal, QC, Canada, ⁴Cytel, Waltham, MA, USA

Presentation Documents

MSR33_ISPOR 2023_Liu_Poster_Final126821.pdf

OBJECTIVES: SLRs are labor-intensive and time-consuming, but they are required for regulatory and health technology assessments (HTA). We previously reported a novel AI tool, LiveSTART^TM, which uses transfer learning to perform the title and abstract (TiAb) review stage of the SLR process, with accuracy = 0.92, precision = 0.91, recall = 0.86, F1-score = 0.89, and AUC = 0.91. To increase confidence in LiveSTART^TM predictions, we investigated whether re-training with additional datasets could improve its performance.
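For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are conventionally computed from confusion-matrix counts in a TiAb screening task. The function name and the counts are hypothetical and are not taken from the LiveSTART^TM evaluation; AUC is omitted because it requires ranked prediction scores rather than counts.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: relevant publications correctly included
    fp: irrelevant publications wrongly included
    tn: irrelevant publications correctly excluded
    fn: relevant publications wrongly excluded
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for 1,000 screened abstracts (illustration only).
metrics = screening_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy is higher than precision and recall because most screened abstracts are true exclusions, which is one reason screening tools are usually reported with all of these metrics rather than accuracy alone.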

METHODS: LiveSTART^TM uses deep learning (a 12-layer neural network) to identify text relevant to population, intervention/comparator, outcome, and study design (PICOS), and then hierarchically predicts publication acceptance based on the given inclusion/exclusion criteria. In addition to the 59 SLR datasets (65,328 publications) reported in November 2022, 42 more datasets (18,371 publications) were prepared for re-training the existing machine learning model. All new datasets were manually annotated by two independent reviewers, and discrepancies were verified by a third, senior reviewer.
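The dual-review annotation step can be sketched as follows. This is an illustrative outline only, assuming each reviewer records an "include" or "exclude" decision per publication; the function `reconcile`, the `adjudicate` callback, and the publication identifiers are hypothetical and do not describe the authors' actual workflow or tooling.

```python
from typing import Callable


def reconcile(
    reviewer_a: dict[str, str],
    reviewer_b: dict[str, str],
    adjudicate: Callable[[str, str, str], str],
) -> dict[str, str]:
    """Merge two independent sets of screening decisions.

    Agreements are accepted as-is; disagreements are passed to a third
    (senior) reviewer, represented here by the `adjudicate` callback.
    """
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("Both reviewers must annotate the same publications")

    final: dict[str, str] = {}
    for pub_id, decision_a in reviewer_a.items():
        decision_b = reviewer_b[pub_id]
        if decision_a == decision_b:
            final[pub_id] = decision_a
        else:
            final[pub_id] = adjudicate(pub_id, decision_a, decision_b)
    return final


# Hypothetical annotations for three publications.
first = {"pub1": "include", "pub2": "exclude", "pub3": "include"}
second = {"pub1": "include", "pub2": "include", "pub3": "exclude"}

# Stand-in for the senior reviewer: always excludes (illustration only).
labels = reconcile(first, second, lambda pub_id, a, b: "exclude")
print(labels)
```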

RESULTS: The 42 new datasets covered 10 oncology indications and comprised 22 economic and 20 health-related quality-of-life (HR-QOL) SLRs. In total, LiveSTART^TM has now been trained with 101 datasets: 47 clinical, 28 economic, and 26 HR-QOL. Compared with the original results published in November 2022, an average increase of approximately 2% was observed across performance metrics: accuracy increased by 0.01, precision by 0.02, recall by 0.02, F1-score by 0.025, and AUC by 0.02.
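The arithmetic behind these figures can be checked with a short sketch. It assumes that the reported increases are absolute differences added to the November 2022 baseline values quoted in OBJECTIVES; the abstract does not list the updated metric values explicitly, so the `updated` values below are derived under that assumption rather than quoted.

```python
# Baseline metrics reported in November 2022 (from OBJECTIVES).
baseline = {"accuracy": 0.92, "precision": 0.91, "recall": 0.86, "f1": 0.89, "auc": 0.91}

# Increases reported after re-training (from RESULTS), assumed to be absolute.
gain = {"accuracy": 0.01, "precision": 0.02, "recall": 0.02, "f1": 0.025, "auc": 0.02}

# Derived post-re-training values and the mean increase across metrics.
updated = {name: round(baseline[name] + gain[name], 3) for name in baseline}
mean_gain = round(sum(gain.values()) / len(gain), 3)

# Dataset counts: original plus new, and the breakdown by SLR type.
total_datasets = 59 + 42
total_by_type = 47 + 28 + 26

print(updated)
print("mean gain:", mean_gain)
print("datasets:", total_datasets, "by type:", total_by_type)
```

Under this assumption, the mean increase across the five metrics rounds to about 0.02, which is consistent with the "approximately 2%" summary, and both dataset totals agree at 101.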

CONCLUSIONS: Supervised machine learning models rely on annotated training data to improve performance, and training dataset size has been reported to correlate positively with performance. We expect to see better predictions with this 58% increase in training data. LiveSTART^TM could potentially yield significant time savings; however, adoption by regulatory and HTA authorities will be required.
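The growth in training data can be expressed in more than one way, and the abstract does not state which basis underlies the quoted percentage. The sketch below simply computes relative growth by dataset count and by publication count from the numbers given in METHODS; the variable names are illustrative.

```python
# Counts taken from METHODS.
datasets_before, datasets_added = 59, 42
publications_before, publications_added = 65_328, 18_371

# Relative growth = added / original, on each basis.
dataset_growth = datasets_added / datasets_before
publication_growth = publications_added / publications_before

print(f"Growth by dataset count:     {dataset_growth:.1%}")
print(f"Growth by publication count: {publication_growth:.1%}")
```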

Conference/Value in Health Info

2023-05, ISPOR 2023, Boston, MA, USA

Value in Health, Volume 26, Issue 6, S2 (June 2023)

Code

MSR33

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
