AI-Driven Simulation of FDA DCOA Review to Inform Evidence Planning and Regulatory Strategy
Author(s)
Rajdeep Kaur, PhD1, Sarah Donelson, MA2, Matthew Dixon, PharmD, PhD3, Barinder Singh, RPh1, Shubhram Pandey, MSc1, Nicola Waddell, HNC4, Siguroli Teitsson, BSc, MSc5.
1Pharmacoevidence Pvt. Ltd., Mohali, India, 2Bristol Myers Squibb, Novato, CA, USA, 3Bristol Myers Squibb, Princeton, NJ, USA, 4Pharmacoevidence Pvt. Ltd., London, United Kingdom, 5Bristol Myers Squibb, London, United Kingdom.
OBJECTIVES: The U.S. Food and Drug Administration (FDA) requires robust, evidence-based Clinical Outcome Assessments (COAs) to evaluate patient-focused evidence and demonstrate treatment benefit. The goal of this study was to develop a Retrieval-Augmented Generation (RAG)-based, agent-driven approach to support early evaluation of COA strategies and identify evidence gaps using FDA guidance-aligned regulatory feedback.
METHODS: The interface integrated a multi-agent architecture with a RAG framework to simulate engagement between the sponsor and the FDA during COA evaluation. Multiple AI agents were developed, each assigned a specific role reflecting that of an FDA reviewer: Division of Clinical Outcome Assessment (DCOA) Lead, Patient-Focused Statistical Scientist (PFSS) Team Leader, Statistician, Medical Officer, and Clinical Team Leader. Relevant sponsor documents and FDA guidance documents (e.g., FDA's 2009 Guidance for Industry and the Patient-Focused Drug Development (PFDD) guidance series) were uploaded, and their embeddings were stored. The tool was evaluated in two therapeutic areas using regulatory questions on the conceptual model, content validity, and psychometric adequacy. Subject matter experts (SMEs) reviewed the AI agents' responses using a binary scoring system: 1 if the output was factually accurate and aligned with relevant regulatory guidance, and 0 otherwise.
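To make the described pipeline concrete, below is a minimal, hypothetical Python sketch of the pattern the METHODS paragraph outlines: sponsor and guidance documents are embedded and stored, each FDA reviewer role is instantiated as an agent, and a regulatory question is answered against retrieved context. All names, the toy embedding, and the sample document snippets are illustrative assumptions, not the authors' implementation, which the abstract does not disclose.

```python
# Minimal sketch of the multi-agent RAG pattern described above.
# Everything here (class names, toy embedding, sample snippets) is
# hypothetical; a production system would use a real embedding model,
# a vector store, and an LLM prompted with each reviewer role.
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str, dim: int = 256) -> list[float]:
    """Toy hashed bag-of-words embedding standing in for a real model.
    Note: Python's str hash is salted per process, which is fine for a
    single-run demo but not for persisted embeddings."""
    vec = [0.0] * dim
    for token, count in Counter(text.lower().split()).items():
        vec[hash(token) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class Document:
    source: str              # e.g., sponsor dossier or an FDA guidance
    text: str
    vector: list[float] = field(default_factory=list)


class ReviewerAgent:
    """One simulated FDA reviewer role (DCOA Lead, Statistician, ...)."""

    def __init__(self, role: str, corpus: list[Document]):
        self.role = role
        self.corpus = corpus

    def answer(self, question: str, k: int = 2) -> str:
        qv = embed(question)
        # Retrieve the k most similar stored documents (cosine similarity
        # of unit vectors reduces to a dot product).
        top = sorted(
            self.corpus,
            key=lambda d: -sum(a * b for a, b in zip(qv, d.vector)),
        )[:k]
        context = " | ".join(f"{d.source}: {d.text}" for d in top)
        # A real agent would pass `context` plus a role-specific prompt
        # to an LLM; here we simply surface the retrieved evidence.
        return f"[{self.role}] grounded on -> {context}"


corpus = [
    Document("FDA 2009 PRO Guidance",
             "Content validity requires evidence that items reflect the patient experience."),
    Document("Sponsor dossier",
             "The conceptual model links symptoms and function to treatment benefit."),
]
for doc in corpus:
    doc.vector = embed(doc.text)

roles = ["DCOA Lead", "PFSS Team Leader", "Statistician",
         "Medical Officer", "Clinical Team Leader"]
agents = [ReviewerAgent(role, corpus) for role in roles]
print(agents[0].answer("What evidence supports the COA's content validity?"))
```

In this sketch every agent shares one corpus and differs only by role label; the abstract leaves open whether the actual tool varies the retrieval scope or only the role-specific prompting per reviewer.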
RESULTS: The interface successfully simulated regulatory responses, as determined by SMEs who assessed accuracy and alignment with regulatory guidance. In the first therapeutic area, SMEs evaluated nine regulatory questions and confirmed complete alignment, yielding a score of 9 out of 9. Similarly, in the second therapeutic area, all three responses were confirmed as aligned, yielding a score of 3 out of 3. These findings indicate strong concordance between the AI-generated outputs and regulatory expectations across both therapeutic areas.
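As a small illustrative note, the binary SME evaluation described in METHODS reduces to a simple tally; the hypothetical sketch below reproduces the reported scores.

```python
# Hypothetical tally of the SME binary scores reported above.
def sme_score(flags: list[int]) -> str:
    """Each flag is 1 (factually accurate and guidance-aligned) or 0."""
    return f"{sum(flags)} out of {len(flags)}"

print(sme_score([1] * 9))  # first therapeutic area  -> "9 out of 9"
print(sme_score([1] * 3))  # second therapeutic area -> "3 out of 3"
```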
CONCLUSIONS: By aligning agent evaluations with sponsor evidence and regulatory guidance, the system enables early detection of evidentiary gaps and improves submission readiness. Further research is needed to test applicability across disease areas.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR24
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas