AutoCriteria: Advancing Clinical Trial Study With AI-Powered Eligibility Criteria Extraction
Data S1, Lee K2, Paek H1, Manion FJ1, Ofoegbu N1, Du J1, Li Y3, Huang LC1, Wang J1, Lin B1, Xu H4, Wang X1
1Melax Technology, part of IMO, Houston, TX, USA, 2Melax Technology, part of IMO, Ardsley, NY, USA, 3Regeneron Pharmaceuticals, Tarrytown, NY, USA, 4Yale University, New Haven, CT, USA
OBJECTIVES: Natural Language Processing (NLP) techniques offer the potential to enhance the efficiency of clinical trial studies by automatically extracting eligibility criteria. Nevertheless, existing NLP approaches face challenges in capturing fine-grained criteria within a given text and may lack applicability across various disease areas. Our aim is to develop a system, built on a cutting-edge large language model, that automatically extracts eligibility criteria, captures their contextual attributes, and handles diverse diseases.
METHODS: We acquired clinical trial data from ClinicalTrials.gov, spanning oncologic, neurodegenerative, autoimmune, endocrine, and circulatory system disorders. Our system comprised pre-processing, knowledge ingestion, GPT-based prompt modeling, post-processing, and interim evaluation modules. The prompt was designed to be applicable across diverse disease areas. We conducted quantitative and qualitative evaluations on 180 manually annotated trials, covering nine distinct diseases.
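As an illustration of what a pre-processing step of this kind might involve, the sketch below splits a free-text eligibility block into inclusion and exclusion items before any prompting. It is not the authors' implementation: the function name `split_criteria`, the reliance on "Inclusion Criteria" and "Exclusion Criteria" header lines, and the bullet-stripping rule are all assumptions made for this example.

```python
import re

# Hypothetical helper, not part of AutoCriteria: splits eligibility text
# into inclusion and exclusion items using the section header lines.
HEADER = re.compile(r"^(inclusion|exclusion) criteria\s*:?\s*$", re.IGNORECASE)
# Leading list markers such as "-", "*", "•", "1." or "2)".
BULLET = re.compile(r"^(?:[-*•]|\d+[.)])\s*")


def split_criteria(text):
    """Return {'inclusion': [...], 'exclusion': [...]} from free text."""
    sections = {"inclusion": [], "exclusion": []}
    current = None  # section we are currently inside, if any
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            current = header.group(1).lower()
            continue
        if current is not None:
            # Drop the list marker and keep the criterion text itself.
            sections[current].append(BULLET.sub("", line))
    return sections
```

Text that appears before the first header is ignored in this sketch; a real pipeline would likely need to handle trials whose criteria are not labeled this cleanly.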
RESULTS: The AutoCriteria system performed strongly in identifying criteria entities, attaining an overall F1 score of 89.42 across the analyzed diseases, with per-disease scores ranging consistently from 84.10 to 95.44. The system handles the intricate contextual aspects of criteria by emphasizing crucial attributes such as value, temporality, and logical relationships between criteria; on this contextual information it achieved an accuracy of 78.95% across all diseases. Moreover, the system successfully manages complex scenarios involving multiple arm conditions, as confirmed by our evaluation across all disease areas.
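For readers less familiar with the metric, the following minimal sketch shows how an entity-level F1 score can be computed from gold and predicted annotations. The abstract does not state the matching rule or averaging scheme, so exact matching on (type, text) pairs and the function name `entity_f1` are assumptions made for this example. The function returns values between 0 and 1, whereas the abstract reports F1 on a 0 to 100 scale.

```python
def entity_f1(gold, predicted):
    """Return (precision, recall, F1) for exact-match entity sets.

    Each entity is any hashable value, for example a (type, text) tuple.
    Exact matching is an assumption of this sketch, not the paper's rule.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so a system cannot score well by over-predicting entities or by extracting only the easiest ones.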
CONCLUSIONS: Our GPT-powered AutoCriteria system shows significant promise for reducing cost and improving resource allocation and efficiency by minimizing reliance on manual annotation. By offering finer granularity, improved accuracy, and the ability to generalize across diverse disease domains, the system could streamline the clinical trial process and reduce the time required to initiate and conduct trials, ultimately benefiting both investigators and patients.
Conference/Value in Health Info
Value in Health, Volume 26, Issue 11, S2 (December 2023)
Methodological & Statistical Research, Study Approaches
Artificial Intelligence, Machine Learning, Predictive Analytics, Clinical Trials
Neurological Disorders, No Additional Disease & Conditions/Specialized Treatment Areas, Oncology, Rare & Orphan Diseases