USER EVALUATION OF AN AGENTIC AI ASSISTANT FOR REAL-WORLD EVIDENCE FEASIBILITY ANALYSIS
Author(s)
Angela Watkins, MBA, MPH1, Amar Das, PhD, MD2, Brandon Theodorou, PhD3, Jimeng Sun, PhD3;
1Guardant Health, Oklahoma City, OK, USA, 2Guardant Health, Pasadena, CA, USA, 3Keiji.AI, Seattle, WA, USA
OBJECTIVES: Guardant Health faces a growing need to scale real-world evidence (RWE) generation using GuardantINFORM, an IRB-approved clinical-genomic database. To address this need, Guardant and Keiji jointly developed an agentic AI assistant, named Elsie, to automate cohort design, statistical computation, and analytic code generation through a conversational, no-code interface. Elsie's analytic capabilities were trained on 20 curated feasibility projects. Elsie supports multi-step reasoning through natural language queries and executes code in Python, SQL, or R. Guardant and Keiji conducted weekly training and evaluation cycles incorporating user feedback to enhance Elsie's accuracy, completeness, and interpretability. The objective of this study was to evaluate the usability, perceived trust, and workflow design of Elsie for RWE feasibility analyses among new users through structured user acceptance testing (UAT).
METHODS: Eight non-technical users were invited to complete a 60‑minute UAT session. Each user started a chat session and completed two standard feasibility analyses and one self-generated feasibility analysis. Facilitators used a scripted think‑aloud protocol and took transcribed notes. Afterward, users provided feedback on design, speed, and accuracy. The standard System Usability Scale (SUS) was administered to score usability quantitatively.
RESULTS: All eight users successfully completed the three core tasks. Numeric results aligned with prior GuardantINFORM‑based analyses. The mean SUS score was 73, indicating acceptable usability. Among item-level SUS responses, desire to use the system was rated highest and perceived need for support was rated lowest. Users described the interaction with Elsie as “user friendly” and “simple and straightforward.” Most noted that Elsie’s performance was fast or reasonable.
CONCLUSIONS: Structured UAT showed that an agentic AI assistant layered on existing RWE infrastructure can be learned quickly, produces outputs perceived as credible, and fits naturally into feasibility workflows. Next steps include expanding dataset availability and extending testing to additional user groups outside of Guardant.
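NOTE ON SUS SCORING: The abstract reports a mean SUS score but does not describe how it was computed. The sketch below follows the conventional SUS scoring rule (ten items rated 1 to 5; odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response; the sum is multiplied by 2.5 to give a 0 to 100 score). It assumes the study used this standard rule. The function name and the example responses are hypothetical illustrations, not study data.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for one participant.

    `responses` is a sequence of ten integers (1-5), in questionnaire order.
    Assumes the conventional SUS scoring rule; not taken from the study itself.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each SUS response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    return total * 2.5


# Hypothetical responses for two participants (illustrative only).
example_participants = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
example_scores = [sus_score(p) for p in example_participants]
example_mean = mean(example_scores)
```

A study-level SUS score such as the one reported above would then be the mean of the per-participant scores.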
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD52
Topic
Real World Data & Information Systems
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, SDC: Oncology