OPENING THE BLACK BOX: A HUMAN-GOVERNED AGENTIC AI FRAMEWORK FOR HEALTH ECONOMIC MODELING
Author(s)
Haidong Feng, MPH, MS1, Augustine Annan, PhD2, Hannah Paek, BA3, Meng Li, MS, PhD4, Xiaoyan Wang, PhD5;
1Merck, Boston, MA, USA, 2NouStarX, Stafford, TX, USA, 3Binghamton University, Vestal, NY, USA, 4Tufts Medical Center, The Center for the Evaluation of Value and Risk in Health, Boston, MA, USA, 5Tulane University, New Orleans, LA, USA
OBJECTIVES: The application of agentic AI in health economic modeling remains limited by concerns about transparency, reproducibility, and insufficient human oversight. We propose a human-governed framework that integrates multi-layer agents to enhance analytic efficiency while maintaining methodological control. We evaluated it on two tasks: survival analysis with external real-world validation, and model parameterization via AI-enabled evidence synthesis.
METHODS: We developed a five-layer, modular, agentic AI architecture that combines retrieval-augmented generation with deterministic statistical computation under explicit human governance. The framework comprises: (1) an MDP orchestrator that defines model structure and analytic workflows in accordance with NICE guidance; (2) a data and evidence executor agent that performs auditable tasks, including real-world evidence synthesis, individual patient data reconstruction, and survival modeling; (3) a modeling and analysis agent that executes the economic model, including scenario analyses; (4) a validation and optimization agent that evaluates outputs against statistical, clinical, and real-world plausibility criteria and refines model specifications; and (5) a reporting agent that generates HTA-compliant documentation and visualizations. The KEYNOTE-024 trial was used to benchmark the survival analysis and evidence-based parameterization workflows. Two human experts independently evaluated agent-generated outputs at all critical checkpoints.
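The layered, human-gated control flow described in METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (`Layer`, `GovernedPipeline`), the `approve` reviewer hook, the stub layers, and the choice to place a checkpoint after every layer are all assumptions made for illustration. The abstract states only that experts reviewed "all critical checkpoints".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]

# Hypothetical layer names mirroring the five layers listed in METHODS.
LAYER_NAMES = [
    "orchestrator",
    "data_and_evidence",
    "modeling_and_analysis",
    "validation_and_optimization",
    "reporting",
]


@dataclass
class Layer:
    name: str
    run: Callable[[State], State]  # deterministic step for this layer


@dataclass
class GovernedPipeline:
    layers: List[Layer]
    approve: Callable[[str, State], bool]  # human reviewer hook (assumed)
    audit_log: List[str] = field(default_factory=list)

    def execute(self, state: State) -> State:
        for layer in self.layers:
            # Pass a copy so the layer works on a snapshot of prior state.
            state = layer.run(dict(state))
            self.audit_log.append(f"{layer.name}: completed")
            # Assumption: every layer ends in a human checkpoint.
            if not self.approve(layer.name, state):
                self.audit_log.append(f"{layer.name}: rejected by reviewer")
                raise RuntimeError(f"halted at {layer.name}")
            self.audit_log.append(f"{layer.name}: approved by reviewer")
        return state


def make_stub(key: str) -> Callable[[State], State]:
    """Placeholder for real agent work; it only records that the layer ran."""
    def run(state: State) -> State:
        state[key] = "done"
        return state
    return run


def build_pipeline(approve: Callable[[str, State], bool]) -> GovernedPipeline:
    return GovernedPipeline([Layer(n, make_stub(n)) for n in LAYER_NAMES], approve)
```

In this sketch a rejection at any checkpoint stops the run and leaves an audit trail, which is one simple way to keep agent output from flowing downstream without sign-off.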
RESULTS: The agentic framework achieved a 99.94% reduction in analysis time: survival analyses were completed in 17 minutes, and the total completion time, including structured human validation, was 2.5 hours, compared with a traditional 2-3 week workflow. Kaplan-Meier digitization closely matched published results, including 6-month OS for pembrolizumab (80.4% vs 80.2%) and chemotherapy (73.0% vs 72.4%). Reconstructed treatment effects were consistent with trial estimates (OS HR 0.61 vs 0.60; >95% CI overlap). The validator agent excluded 36 of 84 candidate models as clinically implausible, and experts confirmed these exclusions. Survival extrapolations were externally validated against real-world data and were consistent with them.
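The abstract does not state which baseline underlies the 99.94% figure. As a rough check, the sketch below computes the percentage reduction under one assumed reading: 17 minutes of agent run time compared with calendar time for the traditional workflow. The function name and the calendar-time baseline are assumptions, not details from the study.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # calendar minutes; an assumed baseline unit


def percent_reduction(new_minutes: float, baseline_minutes: float) -> float:
    """Percentage reduction in elapsed time relative to a baseline."""
    return 100.0 * (1.0 - new_minutes / baseline_minutes)


# 17-minute survival analysis against the 2- and 3-week ends of the range.
vs_two_weeks = percent_reduction(17, 2 * MINUTES_PER_WEEK)
vs_three_weeks = percent_reduction(17, 3 * MINUTES_PER_WEEK)

# Full 2.5-hour run (including human validation) against 3 weeks.
full_run_vs_three_weeks = percent_reduction(150, 3 * MINUTES_PER_WEEK)

print(f"{vs_two_weeks:.2f}% {vs_three_weeks:.2f}% {full_run_vs_three_weeks:.2f}%")
```

Under these assumptions the reported figure corresponds to the 3-week end of the range. A working-hours baseline would give a different value.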
CONCLUSIONS: A human-governed agentic AI framework can markedly accelerate health economic modeling while maintaining transparency, reproducibility, and HTA-aligned methodological rigor.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
P4
Topic
Health Technology Assessment
Topic Subcategory
Systems & Structure
Disease
SDC: Oncology