Using Artificial Intelligence to Predict Patients' Preferences

Author(s)

Tina Cheng, MPH, Juan M. Gonzalez, PhD, Shelby Reed, RPh, PhD, Semra Ozdemir, PhD;
Duke University, Durham, NC, USA
OBJECTIVES: This study examined the potential of large language models (LLMs), such as OpenAI's GPT-4, to predict patient preferences for medical treatments based on past choice data.
METHODS: The predictive capabilities of GPT-4 were evaluated using synthetic data derived from real patient choices in a discrete choice experiment (DCE) related to cancer care. The synthetic dataset included 50 patients, each answering 48 questions comparing two treatment options that varied by expected survival, long-term survival, health limitation, and out-of-pocket cost. For each patient, data were split into training and testing sets, where GPT-4 was tasked with predicting the treatment option a patient would choose. Various input conditions were tested, including framing GPT-4 as an “assistant” versus a “patient”, randomizing training and testing set questions, and standardizing outputs. The analysis included quantitative measures (prediction accuracy) and qualitative assessments (semantic and contextual appropriateness of responses).
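The per-patient workflow described above (split a patient's answered DCE questions into training and testing sets, prompt the LLM with past choices, predict held-out choices, score accuracy) can be sketched as follows. This is an illustrative reconstruction, not the study's actual code: the attribute names and levels, the linear-utility "synthetic patient", and `stub_llm_predict` (a naive stand-in for the GPT-4 API call) are all assumptions introduced for the example.

```python
import random

random.seed(0)

ATTRS = ["expected_survival_months", "long_term_survival_pct",
         "health_limitation", "out_of_pocket_cost"]

def make_question():
    """One DCE question: two treatment profiles with randomly drawn attribute levels.
    Attribute levels here are invented for illustration."""
    def profile():
        return {
            "expected_survival_months": random.choice([12, 24, 36]),
            "long_term_survival_pct": random.choice([10, 20, 40]),
            "health_limitation": random.choice([0, 1, 2]),   # 0 = none
            "out_of_pocket_cost": random.choice([0, 5000, 20000]),
        }
    return profile(), profile()

def true_choice(a, b, weights):
    """Synthetic 'patient': chooses the option with higher linear utility."""
    util = lambda p: sum(weights[k] * p[k] for k in ATTRS)
    return "A" if util(a) >= util(b) else "B"

def build_prompt(train):
    """Few-shot prompt listing the patient's past choices (sketch of what
    would be sent to the LLM)."""
    lines = ["The patient previously made these treatment choices:"]
    for a, b, choice in train:
        lines.append(f"A={a} B={b} -> chose {choice}")
    lines.append("Predict the choice for the next question. Answer 'A' or 'B'.")
    return "\n".join(lines)

def stub_llm_predict(prompt, a, b):
    """Hypothetical stand-in for the GPT-4 call: a naive rule preferring
    longer survival and lower cost (NOT the study's model)."""
    score = lambda p: p["expected_survival_months"] - p["out_of_pocket_cost"] / 1000
    return "A" if score(a) >= score(b) else "B"

# Simulate one patient: 48 answered questions, split into training and testing sets.
weights = {"expected_survival_months": 1.0, "long_term_survival_pct": 0.5,
           "health_limitation": -5.0, "out_of_pocket_cost": -0.001}
questions = [make_question() for _ in range(48)]
answered = [(a, b, true_choice(a, b, weights)) for a, b in questions]
train, test = answered[:36], answered[36:]

prompt = build_prompt(train)
correct = sum(stub_llm_predict(prompt, a, b) == truth for a, b, truth in test)
accuracy = correct / len(test)
print(f"accuracy on held-out questions: {accuracy:.1%}")
```

In the actual study, `stub_llm_predict` would be replaced by a GPT-4 API call given the few-shot prompt, and the loop repeated per patient to yield the per-respondent accuracies summarized in the RESULTS.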
RESULTS: When identical training and testing questions were used for each respondent, GPT-4 achieved an average prediction accuracy of 70.5% (Standard Deviation [SD] = 7.8%). After randomizing training and testing questions, accuracy remained stable at 69.9% (SD = 10.9%), although greater variability was observed across respondents. This variability likely reflected differences in the informativeness of DCE questions, some of which had greater attribute differences between alternatives. Prompt engineering revealed that GPT-4 predictions aligned with expected utility theory, regardless of its assigned role (assistant or patient) or response depth (simple answer or answer with reasoning).
CONCLUSIONS: This study demonstrates that LLMs have the potential to effectively predict patient preferences, with accuracy surpassing the typical caregiver prediction accuracy of approximately 50%, which is comparable to random guessing. However, further research is needed to evaluate LLM reliability and the ability to detect preference heterogeneity, and to address ethical considerations. These findings highlight the potential of LLMs to support patient-centered decision-making, particularly for patients who lose decision-making capacity.

Conference/Value in Health Info

2025-05, ISPOR 2025, Montréal, QC, Canada

Value in Health, Volume 28, Issue S1

Code

MSR31

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

SDC: Oncology
