Optimizing Systematic Literature Reviews in Endometrial Cancer: Leveraging AI for Real-Time Article Screening and Data Extraction in Clinical Trials

Author(s)

Datta S¹, Lee K², Paek H¹, Mojarad MR¹, Prabhu V³, Zhang J³, Foley E¹, Glasgow J¹, Liston C¹, Zheng Y³, Huang YL³, Du J¹, Wang X¹, Cossrow N³, Cai J⁴, Wang D³
¹IMO health, Rosemont, IL, USA, ²IMO health, Ardsley, NY, USA, ³Merck & Co., Inc., Rahway, NJ, USA, ⁴Merck & Co., Inc., West Point, PA, USA

OBJECTIVES: Traditional systematic literature review (SLR) methods produce high-quality evidence but are time-consuming and resource-intensive, a burden that grows with the increasing volume of trials and published literature. Our goal was to improve this process by developing an efficient artificial intelligence (AI) system, based on Generative Pre-trained Transformer 4 (GPT-4), for real-time article screening and data extraction.

METHODS: We collected endometrial cancer clinical trial studies from PubMed using specific keywords to construct our GPT system. Our GPT-4-based automatic literature review system comprises three components: abstract screening, full-text screening, and data element extraction from abstracts. The abstract and full-text screening components each generate an eligibility decision based on the specific inclusion and exclusion criteria outlined in the study protocols. The data element extraction component identifies important study details (e.g., population size) and clinical trial outcomes (e.g., overall response rate). Quantitative and qualitative evaluations were conducted to assess system performance on 50 randomly selected endometrial cancer clinical trial articles labeled by experts.
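The three components described above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the names `Article`, `ReviewResult`, and `review` are hypothetical, the sequential ordering (abstract screening, then full-text screening, then extraction) is an assumption, and the `keyword_screen` and `placeholder_extract` functions are simple stand-ins for the GPT-4 calls, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical interfaces: a screener returns True when the text meets the
# criteria; an extractor returns data elements found in an abstract.
Screener = Callable[[str, List[str]], bool]
Extractor = Callable[[str], Dict[str, str]]


@dataclass
class Article:
    pmid: str
    abstract: str
    full_text: Optional[str] = None  # not every article has full text available


@dataclass
class ReviewResult:
    pmid: str
    passed_abstract: bool
    passed_full_text: Optional[bool] = None  # None means full text was not screened
    elements: Dict[str, str] = field(default_factory=dict)


def review(article: Article, criteria: List[str],
           screen: Screener, extract: Extractor) -> ReviewResult:
    """Run abstract screening, full-text screening, then extraction (assumed order)."""
    result = ReviewResult(article.pmid, screen(article.abstract, criteria))
    if not result.passed_abstract:
        return result
    if article.full_text is not None:
        result.passed_full_text = screen(article.full_text, criteria)
        if not result.passed_full_text:
            return result
    # Per the abstract, data elements are extracted from the abstract text.
    result.elements = extract(article.abstract)
    return result


# Stand-ins for the model calls, used only to make the sketch runnable.
def keyword_screen(text: str, criteria: List[str]) -> bool:
    return all(term.lower() in text.lower() for term in criteria)


def placeholder_extract(abstract: str) -> Dict[str, str]:
    return {"population_size": "not extracted in this sketch"}


example = Article(pmid="0", abstract="A phase II trial in endometrial cancer.")
outcome = review(example, ["endometrial cancer"], keyword_screen, placeholder_extract)
print(outcome.passed_abstract, outcome.passed_full_text)
```

Passing the screener and extractor as callables keeps the control flow independent of any particular model or prompt, so the stand-ins could be swapped for real model calls without changing `review`.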

RESULTS: For abstract screening, the system’s accuracy was 86%, with precision, recall (sensitivity), and F1 of 0.86, 0.94, and 0.90, respectively. The system’s accuracy in full-text screening was 78.95%. For data element extraction, the F1 scores for identifying study details and clinical outcome values were 0.97 and 0.88, respectively, under strict matching, and 0.99 and 0.97, respectively, under relaxed matching. Qualitative analysis identified challenges that included accurately interpreting the language of the screening criteria and capturing partial information during element extraction. Running the whole pipeline, covering both screening and extraction, took 42 minutes.
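The abstract-screening F1 can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic; the function name `f1_score` is illustrative and is not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Reported abstract-screening precision and recall.
abstract_f1 = f1_score(0.86, 0.94)
print(round(abstract_f1, 2))  # prints 0.9, consistent with the reported 0.90
```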

CONCLUSIONS: Our real-time article screening and data extraction AI system correctly identified the large majority (94%) of the manually labeled eligible articles in a fraction of the time required for manual identification. The AI system has the potential to support real-time SLR updates and timely insights.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

MSR103

Topic

Clinical Outcomes, Methodological & Statistical Research, Organizational Practices

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Clinical Outcomes Assessment, Industry

Disease

Oncology
