Can AI-Assisted Data Extraction From HTA Reports Improve Comparative HTA Research: A Case Study on NICE Assessment Reports

Author(s)

Jan-Willem Versteeg, MSc, PharmD¹, Marie De Bruin, PhD¹, Maarten Schermer, Drs.¹, Shiva Nadi Najafabadi, MSc¹, Modhurita Mitra, PhD¹, Christine Leopold, PhD¹, Aukje Mantel-Teeuwisse, PhD¹, Wim Goettsch, MSc, PhD², Lourens Bloem, PhD¹.
¹Utrecht University, Utrecht, Netherlands; ²Zorginstituut Nederland, Diemen, Netherlands.
OBJECTIVES: Data used in comparative health technology assessment (HTA) research are often extracted manually from HTA reports, which limits the scope, reproducibility, updateability, and credibility of this research. This study examines automated methods for extracting research-relevant attributes from publicly available HTA reports and compares the performance of several text-mining techniques, aiming to demonstrate the relevance and opportunities of these extraction methods.
METHODS: Fourteen research-relevant attributes were extracted from National Institute for Health and Care Excellence (NICE) HTA reports using two natural language processing (NLP) techniques, a rule-based approach (NLP-R) and classification models (NLP-CM), and one generative AI technique, a large language model (LLM)-based approach using Claude 3 Opus. Accuracy and other method-specific measures were calculated and compared across the extraction methods. Additionally, the data extracted with the LLM-based approach were analyzed for policy insights.
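To make the rule-based idea concrete, the following is a minimal sketch of how a single attribute might be pulled from report text with pattern rules. It is not the authors' implementation: the attribute (a recommendation outcome), the two patterns, the function name `extract_recommendation`, and the example sentences are all assumptions made for illustration.

```python
import re

# Hypothetical rules for one attribute (recommendation outcome).
# Rules are tried in order and the first matching rule wins.
RULES = [
    ("not recommended", re.compile(r"\bis not recommended\b", re.IGNORECASE)),
    ("recommended", re.compile(r"\bis recommended\b", re.IGNORECASE)),
]


def extract_recommendation(text):
    """Return the label of the first rule that matches, or None if no rule matches."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return None


# Invented example sentences, not taken from any real report.
examples = [
    "Drug A is recommended as an option for treating condition X.",
    "Drug B is not recommended for treating condition Y.",
    "The committee discussed the clinical evidence.",
]
labels = [extract_recommendation(sentence) for sentence in examples]
```

A sketch like this also shows why rule-based extraction can require substantial development work: every attribute needs its own hand-written patterns, and each new phrasing in the reports needs a new rule.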
RESULTS: Extraction accuracies depended on the extraction method and attribute. Overall, the LLM-based approach performed best (88-98% accuracy for 12/14 attributes). Extraction of the outcome of the relative effectiveness assessment (REA) and the comparator was most challenging and had the lowest accuracies (~70% for the LLM-based approach). NLP-based methods required more development work and were unable to extract attributes at the medicine-indication combination level; however, they were independent of commercial software and free from reproducibility issues, which were the most significant limitations of the LLM-based approach. Graphs created using the LLM-extracted data give important policy insights that are updateable and reproducible and would have been difficult to obtain with manual data extraction.
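Per-attribute accuracy of the kind reported here can be computed by comparing extracted values against a manually curated reference set. The sketch below assumes that both sets are lists of records aligned by position and that a value counts as correct only on exact match; the function name `accuracy_per_attribute`, the attribute names, and the toy records are hypothetical and do not come from the study.

```python
from collections import defaultdict


def accuracy_per_attribute(extracted, reference):
    """Share of exactly matching values per attribute.

    Both arguments are lists of dicts (one dict per report), aligned by index.
    An attribute missing from an extracted record counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ext_record, ref_record in zip(extracted, reference):
        for attribute, ref_value in ref_record.items():
            total[attribute] += 1
            if ext_record.get(attribute) == ref_value:
                correct[attribute] += 1
    return {attribute: correct[attribute] / total[attribute] for attribute in total}


# Invented toy data for illustration only.
reference = [
    {"comparator": "placebo", "recommendation": "recommended"},
    {"comparator": "drug B", "recommendation": "not recommended"},
]
extracted = [
    {"comparator": "placebo", "recommendation": "recommended"},
    {"comparator": "drug C", "recommendation": "not recommended"},
]
scores = accuracy_per_attribute(extracted, reference)
```

Exact matching is a deliberately strict choice for this sketch. Free-text attributes such as the comparator may need normalization or a more lenient matching rule before scoring, which is an assumption to revisit for any real evaluation.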
CONCLUSIONS: Automatic data extraction for research-relevant attributes from HTA reports is possible and can provide important insights for comparative HTA research. Room for improvement remains, and future research should focus on expanding the system to different HTA organizations and refining the LLM-based approach.

Conference/Value in Health Info

2025-11, ISPOR Europe 2025, Glasgow, Scotland

Value in Health, Volume 28, Issue S2

Code

P2

Topic

Health Technology Assessment, Methodological & Statistical Research, Study Approaches

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
