Evaluating AI Chatbot Empathy and Probing Skills in Automated Patient In-Trial Interviews: A Proof of Principle Experiment
Author(s)
Bill Byrom, PhD1, Tina Byrom, PhD2.
1Vice President, eCOA Science, Signant Health, Nottingham, United Kingdom, 2Head of Enhanced Academic Practice, Loughborough University, Loughborough, United Kingdom.
OBJECTIVES: In-trial interviews are increasingly used in pharmaceutical clinical trials to collect complementary insights in areas including content validity evidence, meaningful change, and trial participation experience, but are resource-intensive to conduct. This experiment evaluated the feasibility of using an AI chatbot to conduct qualitative in-trial interviews, with special attention to the ability to follow qualitative interview best practices.
METHODS: A proof-of-principle experiment was conducted using Claude 3.5 Sonnet (Anthropic, October 2024) in simulated patient exit interviews. The AI system was provided with contextual information including the trial synopsis, patient profile, clinical outcome assessments (COAs), interview objectives, and qualitative interview best practice guidelines. The chatbot was tasked with conducting a mock interview and reporting anonymized transcripts and summary findings. Assessment criteria focused on the AI's ability to demonstrate contextually appropriate empathetic responses, employ effective probing techniques, and follow interview best practices.
RESULTS: The AI successfully conducted an interview exploring the participant's trial experience and symptom assessment. The chatbot demonstrated the ability to provide an appropriately friendly welcome, use probing questions to explore symptom impact (e.g., "Could you tell me more about how the shortness of breath affected you in your daily life?"), show empathy during conversation (e.g., "That sounds like quite a journey for each visit⋯"), and correctly interpret colloquialisms (e.g., "eat like a horse" for good appetite). Areas where the chatbot needed greater finesse in places included: (a) listing examples up-front rather than using open questions followed by individual probes, (b) making unqualified assumptions about patient feelings in empathetic responses, and (c) asking several questions at once rather than one at a time.
CONCLUSIONS: This experiment demonstrates the potential of AI chatbot technology to efficiently administer and report qualitative in-trial interviews. While more finesse and training are required, along with evaluation across different cultures and languages, this approach could enable more extensive, cost-effective use of in-trial interviews at scale across larger study populations.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
PT18
Topic
Methodological & Statistical Research, Patient-Centered Research, Study Approaches
Topic Subcategory
Patient-reported Outcomes & Quality of Life Outcomes
Disease
No Additional Disease & Conditions/Specialized Treatment Areas