Automating Economic Modelling: Potential of Generative AI for Updating Excel-Based Cost-Effectiveness Models


Rawlinson W1, Klijn S2, Teitsson S3, Malcolm B3, Gimblett A1, Reason T4
1Estima Scientific Ltd, London, UK, 2Bristol Myers Squibb, Utrecht, ZH, Netherlands, 3Bristol Myers Squibb, Uxbridge, UK, 4Estima Scientific Ltd, South Ruislip, LON, UK

OBJECTIVES: Using large language models (LLMs) such as Generative Pre-trained Transformer 4 (GPT-4) to edit Microsoft Excel files could revolutionize the way we interact with health economic models. The aim of this study was to assess the accuracy and capability of GPT-4 in automating the adjustment of an HTA-ready Excel cost-effectiveness model (CEM) for muscle-invasive urothelial carcinoma (MIUC) from the setting of one country to another.

METHODS: A human-conducted adaptation of the model had been submitted to HTA authorities globally, who deemed the model appropriate for decision making. For this case study, GPT-4 was used to adapt the MIUC model from a UK base case to a Czech Republic perspective. Before the study, the model received minor updates to improve its interpretability, such as clarifying vague descriptive text. GPT-4 was then provided with natural language instructions and tabular data that described the adaptations in a human-oriented manner (without the use of cell references). Based on this, GPT-4 automatically updated input values in the Excel model without human intervention. All edits made by GPT-4 were highlighted to facilitate subsequent review by a health economist. Accuracy was assessed by a human reviewer, who checked whether all required adaptations had been performed and whether every update made by GPT-4 was correct.
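The update-and-highlight step can be illustrated with a minimal sketch. It is not the study's implementation: in the study GPT-4 mapped the instructions to model inputs, whereas here a simple case-insensitive label match stands in for that step, and the sheet is a plain Python list rather than an Excel file. The `Cell` class, the `apply_updates` function, the labels and all numeric values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One model input: the descriptive label beside it, its value, and a review flag."""
    label: str
    value: float
    highlighted: bool = False


def apply_updates(sheet, updates):
    """Apply label-keyed updates to a sheet and flag each edited cell.

    Returns the labels that could not be matched, so a reviewer can see
    which required adaptations were not performed.
    """
    by_label = {cell.label.strip().lower(): cell for cell in sheet}
    missed = []
    for label, new_value in updates.items():
        cell = by_label.get(label.strip().lower())
        if cell is None:
            missed.append(label)
            continue
        cell.value = new_value
        cell.highlighted = True  # mark the edit for human review
    return missed


# Hypothetical inputs; labels and values are placeholders, not study data.
sheet = [
    Cell("Discount rate (costs)", 0.035),
    Cell("Cost per oncologist visit", 100.0),
]
updates = {
    "discount rate (costs)": 0.03,   # matched despite different casing
    "Cost per nurse visit": 20.0,    # no such input in the sheet
}
missed = apply_updates(sheet, updates)
```

Returning the unmatched labels mirrors the first accuracy check described above (whether all required adaptations were performed), while the `highlighted` flag supports the second (whether each performed update is correct).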

RESULTS: The AI-generated adaptations were performed in 245 seconds. GPT-4 performed 62 of the 64 required updates, and all 62 were performed correctly. This resulted in an overall accuracy score of 97% (adverse event costs, 100% [7/7]; model settings, 100% [2/2]; drug acquisition and administration costs, 82% [9/11]; resource costs, 100% [32/32]; subsequent treatment proportions, 100% [12/12]).
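The reported percentages can be checked from the per-category counts. The short sketch below uses only the counts stated in the results; the variable and function names are illustrative, and rounding to whole percentages is an assumption about how the figures were reported.

```python
# Correctly performed updates and required updates per category,
# taken from the counts reported in the results.
results = {
    "adverse event costs": (7, 7),
    "model settings": (2, 2),
    "drug acquisition and administration costs": (9, 11),
    "resource costs": (32, 32),
    "subsequent treatment proportions": (12, 12),
}


def pct(correct, required):
    """Accuracy as a whole-number percentage (rounding is an assumption)."""
    return round(100 * correct / required)


per_category = {name: pct(c, r) for name, (c, r) in results.items()}
total_correct = sum(c for c, _ in results.values())
total_required = sum(r for _, r in results.values())
overall = pct(total_correct, total_required)

for name, value in per_category.items():
    print(f"{name}: {value}%")
print(f"overall: {total_correct}/{total_required} = {overall}%")
```

The category counts sum to 62 of 64, so the overall score equals the share of required updates that were performed.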

CONCLUSIONS: This study demonstrates the technical feasibility of using LLMs to automate the editing of Excel-based CEMs. Provided that models are set up clearly, this is a promising early indication that highly accurate edits of input values can be achieved.