Large Language Models for Data Extraction in a Systematic Review: A Case Study
Speaker(s)
Edwards M1, Ferrante di Ruffano L2
1York Health Economics Consortium, York, YOR, UK; 2York Health Economics Consortium, York, NYK, UK
OBJECTIVES: A typical systematic review includes the extraction of highly granular data in a standardized format, a resource-intensive part of the review process. We investigated whether the chat interface to a large language model (Claude 3 Opus) could provide time savings in extracting such data while retaining the accuracy necessary for a systematic review.
METHODS: A data extraction sheet from a completed review of biologic treatments was selected. A set of prompts was designed to obtain details of the methods, interventions, and populations assessed in three of the included studies. Each paper was uploaded individually, and the results were copied into the original data sheet and compared with those produced and checked by two independent human reviewers. Outcome extraction was also tested.
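The abstract does not reproduce the prompts themselves. As an illustration only, a prompt of the kind described (detailed, consistently structured, and requesting granular fields in a fixed format) might resemble the following sketch; all field names and wording are hypothetical.

```python
# Hypothetical sketch only: the actual prompts used in the study are not published here.
# Illustrates a detailed, consistently structured extraction prompt of the kind described.

EXTRACTION_PROMPT = """You are assisting with data extraction for a systematic review.
From the attached study report, extract the following fields for EACH study arm.
Quote the source verbatim where possible; write "Not reported" if a field is absent.

Intervention:
- Drug name
- Dose
- Schedule
- Duration of treatment

Population:
- Number randomised
- Age (mean or median, with range)
- Gender (n, %)
- Duration of disease

Return the answer as a table with one row per arm and one column per field."""
```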
RESULTS: Producing suitably formatted granular data required detailed and consistently structured prompts. Although the model successfully extracted details of the intervention (including dose, scheduling, and duration of treatment) and population (including age, gender, duration of disease, and exon 10 variants) assessed in each arm, it struggled to interpret complex patient flow through the studies. Primary outcomes in the intention-to-treat (ITT) population were successfully extracted, but extraction of secondary outcomes, subgroups, and outcomes at different timepoints proved much less reliable.
CONCLUSIONS: While chat interfaces to LLMs may provide some time savings in extracting basic study data, such interfaces do not lend themselves to the detailed prompts required for successful extraction of more complex data. Accessing an LLM outside a chat interface can be costly and requires a skillset that most reviewers do not possess; organizational investment may therefore be needed to facilitate productive access. Fine-tuning using archive data also raises issues of commercial confidentiality. A market is emerging for companies providing affordable access to a protected model, accessible only to the customer and fine-tuned to their needs.
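As a minimal sketch of what accessing an LLM outside a chat interface involves in practice (and hence of the skillset the conclusion refers to), the following assumes the official Anthropic Python SDK and an API key; the file path, prompt wording, and extraction task are illustrative, not those used in the study.

```python
# Minimal sketch, not the method used in the study (which used the chat interface).
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY environment variable.
import anthropic

# Hypothetical input: the full text of one included study, obtained separately.
with open("study1.txt", encoding="utf-8") as f:
    study_text = f.read()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-opus-20240229",   # Claude 3 Opus, the model discussed above
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Extract the dose, schedule, and duration of treatment for each arm "
            "of the following study, as a table with one row per arm.\n\n"
            + study_text
        ),
    }],
)

# The model's extraction, which would still need checking by a human reviewer.
print(response.content[0].text)
```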
Code
MSR117
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas