GOVERNING RWD- AND GENAI-ENABLED RWE: PRACTICAL, AUDITABLE CONTROLS FOR FASTER—YET TRUSTWORTHY—EVIDENCE GENERATION
Author(s)
Sherrine Eid, BS, MPH
SAS Institute, Global Head, Epidemiology, RWE & Observational Research, Macungie, PA, USA
OBJECTIVES: Real-world data (RWD) and modern AI (including large language models [LLMs]) are rapidly compressing the time from question to insight in real-world evidence (RWE) generation. Yet the same capabilities introduce new risks (data provenance ambiguity, hidden confounding, automation bias, privacy leakage, and non-reproducible pipelines) that can erode confidence among regulators, HTA bodies, and clinicians. We propose a practical governance blueprint that treats "responsible intelligence generation" as an end-to-end lifecycle spanning data acquisition, phenotyping, analysis, reporting, and ongoing monitoring.
METHODS: Our approach operationalizes risk-based governance using the NIST AI Risk Management Framework (and its Generative AI Profile) to define measurable controls across the framework's govern, map, measure, and manage functions, aligned with an AI management system (ISO/IEC 42001) for organizational accountability.
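As an illustrative sketch only (not part of the abstract's formal methods), the mapping of NIST AI RMF functions to measurable controls could be represented as a simple auditable registry; every control name and evidence artifact below is a hypothetical example, not a control prescribed by the framework:

```python
from dataclasses import dataclass

# The four NIST AI RMF core functions.
FUNCTIONS = ("govern", "map", "measure", "manage")

@dataclass
class Control:
    """One measurable control tied to an RMF function."""
    function: str          # one of FUNCTIONS
    name: str              # hypothetical control identifier
    evidence: str          # artifact an auditor would inspect
    satisfied: bool = False

def coverage(controls: list[Control]) -> dict[str, float]:
    """Fraction of controls satisfied per RMF function."""
    out = {}
    for fn in FUNCTIONS:
        subset = [c for c in controls if c.function == fn]
        out[fn] = sum(c.satisfied for c in subset) / len(subset) if subset else 0.0
    return out

# Hypothetical example controls, for illustration only.
registry = [
    Control("govern", "AI policy approved with named ISO/IEC 42001 owner",
            "signed policy document", satisfied=True),
    Control("map", "RWD provenance and lineage documented",
            "data-flow diagram and source contracts", satisfied=True),
    Control("measure", "Bias and drift metrics reviewed per release",
            "monitoring dashboard export"),
    Control("manage", "Incident response and red-team findings triaged",
            "incident log"),
]

print(coverage(registry))
```

A registry like this makes "measurable controls" concrete: each entry names the evidence an auditor would inspect, and per-function coverage exposes which RMF functions still lack satisfied controls.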
RESULTS: For RWD fitness-for-purpose, we recommend standardized data models and automated quality assessment (e.g., OMOP CDM with OHDSI’s Data Quality Dashboard), coupled with explicit data provenance, lineage, and access controls. For regulatory-grade traceability, we align feasibility and observational workflows to FDA’s RWD/RWE considerations (including documentation of data relevance/reliability, protocol-driven analyses, and audit-ready records). For AI/LLM components, we specify controls that are increasingly expected in high-stakes settings: model and prompt versioning; retrieval-augmented generation (RAG) with curated, citable source corpora; structured human-in-the-loop adjudication for NLP-derived phenotypes; bias and drift monitoring; red-teaming and incident response; and reproducible reporting using CONSORT-AI/SPIRIT-AI (for AI interventions) and TRIPOD+AI (for prediction models). Finally, we embed transparency norms from ISPOR/ISPE RWE good practices (study registration, replicability, and clear analytic disclosure) to strengthen credibility across stakeholders.
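To make the model/prompt versioning and human-in-the-loop adjudication controls concrete, a minimal sketch follows; the record fields, version strings, and identifiers are all hypothetical, and the only claim is the general pattern: pin the model and prompt versions, record the citable sources, and fingerprint each record so post-hoc changes are detectable:

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class PhenotypeRecord:
    """Audit-ready record for one NLP-derived phenotype assertion."""
    patient_id: str
    phenotype: str
    model_version: str            # pinned model identifier
    prompt_version: str           # pinned prompt-template version
    source_doc_ids: tuple         # citable sources retrieved via RAG
    llm_label: str                # label proposed by the LLM
    adjudicated_label: Optional[str] = None   # set by human reviewer
    adjudicator: Optional[str] = None

    def fingerprint(self) -> str:
        """Deterministic hash of the record, so any change is detectable."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

# Hypothetical workflow: the LLM proposes a label, a clinician adjudicates.
draft = PhenotypeRecord("P001", "heart_failure", "llm-2025-10-01",
                        "phenotype-prompt-v3", ("note_17", "note_42"),
                        llm_label="positive")
final = PhenotypeRecord("P001", "heart_failure", "llm-2025-10-01",
                        "phenotype-prompt-v3", ("note_17", "note_42"),
                        llm_label="positive",
                        adjudicated_label="positive",
                        adjudicator="md_reviewer_9")

# The adjudication step changes the fingerprint, leaving an audit trail.
assert draft.fingerprint() != final.fingerprint()
```

Persisting both the draft and adjudicated fingerprints yields the audit-ready, protocol-driven record trail the RESULTS section calls for, without committing to any particular storage backend.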
CONCLUSIONS: This blueprint enables organizations to move faster with AI while remaining auditable, patient-protective, and decision-grade.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR90
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas