Enhancing Health Technology Assessment Accessibility: Using ChatGPT Prompts to Streamline Efficiency Frontiers, Cost-Effectiveness and Net Benefit Analyses
Author(s)
Bruno M. Barros1, Alex Itaborahy, Student2, Marcelo Correia, Researcher2, Bernardo Rangel Tura, Researcher2, Carlos Magliano, Sr., PhD2;
1Instituto Nacional de Cardiologia, Student, Rio de Janeiro, Brazil, 2National Institute of Cardiology, Rio de Janeiro, Brazil
OBJECTIVES: Health technology assessment (HTA) often involves complex cost-effectiveness analyses (CEA) that can be challenging for non-experts. This study evaluates the use of AI-powered prompts to streamline CEA processes, particularly in generating efficiency frontiers and net benefit analyses. By simplifying these processes, this approach aims to empower non-modeling HTA specialists and decision-makers to understand therapeutic scenarios and make informed adjustments to economic evaluations.
METHODS: Custom prompts were developed using ChatGPT 4o and tested for usability and reproducibility in ChatGPT 4o (paid version) and 4o mini (free version). These prompts automated key steps of economic analysis, including calculating net monetary benefit (NMB) and net health benefit (NHB), applying dominance and extended dominance principles, and generating efficiency frontier plots. Fifteen HTA professionals without prior modeling experience tested the prompts using predefined hypothetical datasets. Results from both ChatGPT versions were compared with expected outputs, and usability and accuracy were evaluated.
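The steps the prompts automate can be illustrated with a minimal sketch. It is not the authors' prompt or code. It assumes the standard textbook definitions (NMB = effect × willingness-to-pay − cost; NHB = effect − cost / willingness-to-pay), and the strategy names, costs, effects, and willingness-to-pay threshold below are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    cost: float    # total cost per patient (hypothetical currency units)
    effect: float  # total health effect, e.g. QALYs


def net_monetary_benefit(s: Strategy, wtp: float) -> float:
    """NMB = effect * willingness-to-pay - cost."""
    return s.effect * wtp - s.cost


def net_health_benefit(s: Strategy, wtp: float) -> float:
    """NHB = effect - cost / willingness-to-pay."""
    return s.effect - s.cost / wtp


def icer(reference: Strategy, comparator: Strategy) -> float:
    """Incremental cost per unit of incremental effect."""
    return (comparator.cost - reference.cost) / (comparator.effect - reference.effect)


def efficiency_frontier(strategies: list[Strategy]) -> list[Strategy]:
    """Return the strategies left after removing dominated ones."""
    # Order by increasing cost; for equal cost, the most effective comes first.
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.effect))

    # Strong dominance: drop any strategy that is no more effective
    # than a cheaper (or equally costly) alternative.
    candidates: list[Strategy] = []
    for s in ordered:
        if not candidates or s.effect > candidates[-1].effect:
            candidates.append(s)

    # Extended dominance: ICERs must increase along the frontier, so drop
    # a strategy whose ICER exceeds that of the next more effective one.
    frontier: list[Strategy] = []
    for s in candidates:
        while len(frontier) >= 2 and icer(frontier[-2], frontier[-1]) > icer(frontier[-1], s):
            frontier.pop()
        frontier.append(s)
    return frontier


# Hypothetical dataset, for illustration only.
strategies = [
    Strategy("Standard care", 10_000, 5.0),
    Strategy("Therapy A", 15_000, 5.2),  # extendedly dominated
    Strategy("Therapy B", 18_000, 6.0),
    Strategy("Therapy C", 26_000, 5.8),  # dominated by Therapy B
    Strategy("Therapy D", 30_000, 6.5),
]
wtp = 20_000  # hypothetical willingness-to-pay per unit of effect

frontier = efficiency_frontier(strategies)
frontier_icers = [icer(a, b) for a, b in zip(frontier, frontier[1:])]
best_by_nmb = max(strategies, key=lambda s: net_monetary_benefit(s, wtp))

for s in strategies:
    print(f"{s.name}: NMB={net_monetary_benefit(s, wtp):.0f}, NHB={net_health_benefit(s, wtp):.2f}")
print("Frontier:", [s.name for s in frontier])
print("ICERs along the frontier:", frontier_icers)
print("Highest NMB:", best_by_nmb.name)
```

In this hypothetical dataset, the strategy with the highest NMB is also the last one on the frontier whose ICER falls below the threshold, which is the consistency check a reviewer could apply to a chatbot-generated answer. The frontier points could be passed to any plotting library to draw the efficiency frontier.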
RESULTS: ChatGPT 4o achieved 93.3% accuracy in NMB and NHB calculations and correctly applied dominance and extended dominance principles in 80% of cases. It calculated incremental cost-effectiveness ratios (ICERs) for non-dominated therapies in 60% of scenarios and generated efficiency frontier plots in 53.3% of the tests. In contrast, ChatGPT 4o mini showed 60% accuracy for NMB and NHB calculations and 26.7% for applying dominance and extended dominance, and it failed to calculate ICERs or generate frontier plots as expected.
CONCLUSIONS: AI-driven prompts can enhance the accessibility of cost-effectiveness analyses, enabling non-technical stakeholders to develop and interpret efficiency frontiers and net benefit analyses. ChatGPT 4o demonstrated superior reliability compared with ChatGPT 4o mini, particularly in calculations and graphical outputs. However, the limitations of ChatGPT 4o mini underscore the need to tailor prompts to different AI platforms. Future developments could improve the robustness and usability of these tools in HTA applications.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
MSR108
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
STA: Biologics & Biosimilars