Improving the Performance of Generative AI to Achieve 100% Accuracy in Data Extraction


Klijn S¹, Teitsson S², Reason T³, Malcolm B², Hill N⁴, Benbow E⁵
¹Bristol-Myers Squibb, Utrecht, ZH, Netherlands; ²Bristol Myers Squibb, Uxbridge, UK; ³Estima Scientific Ltd, South Ruislip, LON, UK; ⁴Bristol Myers Squibb Company, Princeton, NJ, USA; ⁵Estima Scientific Ltd, Ruislip, UK

OBJECTIVES: We have previously demonstrated the potential of large language models (LLMs), such as GPT-4, to automate data extraction for network meta-analysis (NMA). Whilst a data extraction accuracy of over 97% was achieved, there is scope to improve performance and reliability towards 100% before full implementation in health economics and outcomes research (HEOR). The aim of this study was to assess whether a modal approach improves the accuracy of data extraction from publications reporting overall survival in adult patients with advanced or metastatic non-small cell lung cancer (NSCLC).

METHODS: An a priori defined modal algorithm was postulated, developed, and tested. This used GPT-4, via a Python API, to automatically extract survival data from NSCLC publications multiple times and then calculate the mode of each block of 20 iterations. Results were compared with the data extraction conducted (and checked) by systematic literature review and NMA experts.
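The modal step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the prompt, the API call, or the output schema, so the extractor is passed in as a hypothetical callable (`extract_once`), and the field names and values in the stub are invented purely for demonstration.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Dict, Hashable, List

# Assumed output shape: one dict per run, mapping field name -> extracted value.
Extraction = Dict[str, Hashable]


def modal_extraction(
    extract_once: Callable[[str], Extraction],
    publication_text: str,
    n_iterations: int = 20,
) -> Extraction:
    """Run the extractor n_iterations times and keep the modal value per field."""
    runs: List[Extraction] = [
        extract_once(publication_text) for _ in range(n_iterations)
    ]
    field_names = sorted({name for run in runs for name in run})
    modal: Extraction = {}
    for name in field_names:
        # A field missing from a run is counted as None, so an occasional
        # omission can be outvoted by the runs that did return a value.
        counts = Counter(run.get(name) for run in runs)
        # Ties go to the value seen first; this is an arbitrary choice.
        modal[name] = counts.most_common(1)[0][0]
    return modal


# Hypothetical stand-in for an LLM call: every fourth run drops one field.
# The numbers are illustrative only and are not taken from the study.
_stub_outputs = cycle([
    {"median_os_months": 12.3, "hazard_ratio": 0.75},
    {"median_os_months": 12.3, "hazard_ratio": 0.75},
    {"median_os_months": 12.3},
    {"median_os_months": 12.3, "hazard_ratio": 0.75},
])


def stub_extract(_text: str) -> Extraction:
    return dict(next(_stub_outputs))


result = modal_extraction(stub_extract, "placeholder publication text")
print(result)
```

In a real pipeline, `stub_extract` would be replaced by a function that sends the publication text to the model and parses its response. Taking the mode per field, rather than per whole record, is one possible design choice; the abstract does not say which was used.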

RESULTS: Across 400 iterations of the automatic data extraction, GPT-4 accurately extracted over 99% of the necessary data when compared with the human data extraction. By implementing the modal algorithm, it was possible to achieve a data extraction accuracy of 100% in every one of the 20 blocks of 20 iterations.
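To show how the two accuracy figures relate, the sketch below scores a single data point both per iteration and per block of 20. The function names are hypothetical, and the 400 values are synthetic (four arbitrary misses inserted by hand); they are not the study's data and only illustrate the arithmetic.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def per_iteration_accuracy(runs: Sequence[Hashable], reference: Hashable) -> float:
    """Share of individual runs that match the human-extracted reference."""
    return sum(value == reference for value in runs) / len(runs)


def block_modes(runs: Sequence[Hashable], block_size: int = 20) -> List[Hashable]:
    """Mode of each consecutive block of block_size runs."""
    if len(runs) % block_size != 0:
        raise ValueError("number of runs must be a multiple of block_size")
    return [
        Counter(runs[start:start + block_size]).most_common(1)[0][0]
        for start in range(0, len(runs), block_size)
    ]


def modal_accuracy(
    runs: Sequence[Hashable], reference: Hashable, block_size: int = 20
) -> float:
    """Share of blocks whose modal value matches the reference."""
    modes = block_modes(runs, block_size)
    return sum(mode == reference for mode in modes) / len(modes)


# Synthetic example: 400 runs of one data point, with 4 misses (None)
# placed at arbitrary positions. Not data from the study.
reference_value = 0.75
runs: List[Hashable] = [reference_value] * 400
for miss_index in (5, 57, 210, 399):
    runs[miss_index] = None

print(per_iteration_accuracy(runs, reference_value))  # 396 of 400 runs match
print(modal_accuracy(runs, reference_value))          # every block's mode matches
```

Because each miss is isolated within its block, the correct value remains the most frequent one in all 20 blocks, so the modal result is correct even though individual runs are not.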

CONCLUSIONS: Whilst GPT-4 generally extracts the correct data, there are occasions when it fails to extract all required data from a publication. We have demonstrated an approach that improves the extraction rate and, in the case study considered, results in perfect extraction by GPT-4. This represents a useful method to demonstrate the accuracy, repeatability and reliability of data extracted. Work to apply this approach to the other automated stages of network meta-analysis is underway.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)




Clinical Outcomes, Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Comparative Effectiveness or Efficacy, Meta-Analysis & Indirect Comparisons


No Additional Disease & Conditions/Specialized Treatment Areas, Oncology
