Large Language Models for Data Extraction in a Systematic Review: A Case Study
Speaker(s)
Edwards M1, Ferrante di Ruffano L2
1York Health Economics Consortium, York, YOR, UK; 2York Health Economics Consortium, York, NYK, UK
OBJECTIVES: A typical systematic review includes the extraction of highly granular data in a standardized format, a resource-intensive part of the review process. We investigated whether the chat interface to a large language model (Claude 3 Opus) could provide time savings in extracting such data while retaining the accuracy necessary for a systematic review.
METHODS: A data extraction sheet from a completed review of biologic treatments was selected. A set of prompts was designed to obtain details of the methods, interventions, and populations assessed in three of the included studies. Each paper was uploaded individually, and the results were copied into the original data sheet and compared with those produced and checked by two independent human reviewers. Outcome extraction was also tested.
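The abstract does not reproduce the prompts themselves. As an illustration only, a prompt of the kind described (detailed, consistently structured, and requesting granular fields in a fixed format) might resemble the following sketch; all field names and wording are hypothetical.

```python
# Hypothetical sketch only: the actual prompts used in the study are not published here.
# Illustrates a detailed, consistently structured extraction prompt of the kind described.

EXTRACTION_PROMPT = """You are assisting with data extraction for a systematic review.
From the attached study report, extract the following fields for EACH study arm.
Quote the source verbatim where possible; write "Not reported" if a field is absent.

Intervention:
- Drug name
- Dose
- Schedule
- Duration of treatment

Population:
- Number randomised
- Age (mean or median, with range)
- Gender (n, %)
- Duration of disease

Return the answer as a table with one row per arm and one column per field."""
```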
RESULTS: Producing suitably formatted granular data required detailed and consistently structured prompts. Although the model successfully extracted details of the intervention (including dose, scheduling, and duration of treatment) and population (including age, gender, duration of disease, and exon 10 variants) assessed in each arm, it struggled to interpret complex patient flow through the studies. Primary outcomes in the intention-to-treat (ITT) population were successfully extracted, but extraction of secondary outcomes, subgroups, and outcomes at different timepoints proved much less reliable.
CONCLUSIONS: While chat interfaces to LLMs may provide some time savings in extracting basic study data, such interfaces do not lend themselves to the detailed prompts required for successful extraction of more complex data. Accessing an LLM outside a chat interface can be costly and requires a skillset that most reviewers do not possess; organizational investment may therefore be needed to facilitate productive access. Fine-tuning using archive data also raises issues of commercial confidentiality. A market is emerging for companies providing affordable access to a protected model, accessible only to the customer and fine-tuned to their needs.
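As a minimal sketch of what accessing an LLM outside a chat interface involves in practice (and hence of the skillset the conclusion refers to), the following assumes the official Anthropic Python SDK and an API key; the file path, prompt wording, and extraction task are illustrative, not those used in the study.

```python
# Minimal sketch, not the method used in the study (which used the chat interface).
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY environment variable.
import anthropic

# Hypothetical input: the full text of one included study, obtained separately.
with open("study1.txt", encoding="utf-8") as f:
    study_text = f.read()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-opus-20240229",   # Claude 3 Opus, the model discussed above
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Extract the dose, schedule, and duration of treatment for each arm "
            "of the following study, as a table with one row per arm.\n\n"
            + study_text
        ),
    }],
)

# The model's extraction, which would still need checking by a human reviewer.
print(response.content[0].text)
```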
Code
MSR117
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas