Can Large Language Models Simulate HTA Committee Discussions? Findings and Challenges from a Case Study in Neoadjuvant Treatment of Resectable Non-Small Cell Lung Cancer

Author(s)

Reason T¹, Klijn S², Gimblett A³, Malcolm B⁴
¹Estima Scientific Ltd, South Ruislip, LON, UK, ²Bristol Myers Squibb, Utrecht, ZH, Netherlands, ³Estima Scientific Ltd, London, UK, ⁴Bristol Myers Squibb, Uxbridge, LON, UK

Presentation Documents

ISPOR24_Reason_P46_PRESENTATION138695.pdf

OBJECTIVES: Health Technology Assessment (HTA) committees play a crucial role in evaluating reimbursement dossiers that support the routine use of emerging healthcare technologies and interventions. These committees comprise members with extensive expertise, and their knowledge is not readily available to pharmaceutical manufacturers. We explored whether a large language model can simulate an HTA committee discussion.

METHODS: We developed a large language model (LLM)-based simulation in Python, using GPT-4 Turbo, to replicate an HTA committee discussion, with a real Economic Assessment Group (EAG) report in non-small cell lung cancer (NSCLC) as the reference document. The virtual committee comprised a fixed number of members with varying categorical attributes, including Health Economics and Outcomes Research (HEOR) knowledge, attitude towards the pharmaceutical industry, occupation and personal perspective. These attributes were programmatically modified to generate a range of virtual personalities. The LLM facilitated the committee discussion, with each member contributing to and continuing the discussion according to their predefined characteristics. Finally, a chair, simulated deterministically by the LLM, summarised the discussion and formulated a final recommendation on the healthcare intervention under review.
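The workflow described in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The attribute levels, prompt wording, committee size, number of rounds and all function names are hypothetical, and reading "deterministically" as a sampling temperature of 0 is an assumption. The model call is passed in as a plain function, so the stub used here could be swapped for a real GPT-4 Turbo client.

```python
import random
from dataclasses import dataclass
from itertools import product
from typing import Callable, List

# Hypothetical attribute levels: the abstract names the attribute types, not their values.
HEOR_KNOWLEDGE = ["low", "high"]
INDUSTRY_ATTITUDE = ["sceptical", "neutral", "favourable"]
OCCUPATIONS = ["clinician", "health economist", "patient representative"]

# (system prompt, user prompt, temperature) -> model reply
Complete = Callable[[str, str, float], str]


@dataclass(frozen=True)
class Member:
    heor_knowledge: str
    industry_attitude: str
    occupation: str

    def system_prompt(self) -> str:
        return (
            f"You are a {self.occupation} on an HTA committee. "
            f"Your HEOR knowledge is {self.heor_knowledge} and your attitude "
            f"towards the pharmaceutical industry is {self.industry_attitude}."
        )


def build_committee(size: int, seed: int = 0) -> List[Member]:
    """Draw `size` distinct attribute combinations to form a fixed-size committee."""
    combos = list(product(HEOR_KNOWLEDGE, INDUSTRY_ATTITUDE, OCCUPATIONS))
    return [Member(*c) for c in random.Random(seed).sample(combos, size)]


def run_discussion(report: str, committee: List[Member],
                   complete: Complete, rounds: int = 2) -> List[str]:
    """Each member speaks once per round, seeing the report and the transcript so far."""
    transcript: List[str] = []
    for _ in range(rounds):
        for i, member in enumerate(committee, start=1):
            history = "\n".join(transcript) or "(no contributions yet)"
            user = (f"Reference report:\n{report}\n\n"
                    f"Discussion so far:\n{history}\n\n"
                    "Give your next contribution.")
            reply = complete(member.system_prompt(), user, 0.7)
            transcript.append(f"Member {i} ({member.occupation}): {reply}")
    return transcript


def chair_summary(transcript: List[str], complete: Complete) -> str:
    """Chair summarises and recommends; temperature 0 is an assumed reading of 'deterministically'."""
    user = ("Summarise the discussion and state a final recommendation:\n"
            + "\n".join(transcript))
    return complete("You chair an HTA committee.", user, 0.0)


def stub_complete(system: str, user: str, temperature: float) -> str:
    """Placeholder for a real LLM call, so the sketch runs without network access."""
    return f"[stub reply at temperature {temperature}]"
```

Calling build_committee(4), then run_discussion with the report text and stub_complete, and finally chair_summary on the resulting transcript exercises the full loop. Keeping the model call behind a single function argument makes it straightforward to vary member attributes while holding the rest of the pipeline fixed.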

RESULTS: The LLM was able to generate realistic and coherent committee discussions. Virtual members maintained distinct and consistent personalities, contributing perspectives aligned with their assigned attributes. However, it was difficult to sustain disagreement between members, who tended to converge on a consensus in favour of recommending products. The virtual committee chair effectively summarised the discussions and made recommendations that were consistent with the rest of the virtual discussion.

CONCLUSIONS: This study highlights the potential and limitations of using LLMs to simulate HTA committee discussions. While LLMs show promise in replicating realistic committee dynamics and maintaining diversity in line with distinct member characteristics, further refinement is needed to enhance focus and specificity. This approach paves the way for future research into AI applications for training, policy analysis, and exploring decision-making processes that require committee approval in healthcare settings.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Acceptance Code

P46

Topic

Health Technology Assessment, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Decision & Deliberative Processes

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology
