AI AGENT FOR AUTOMATED QUALITY CHECK OF MS EXCEL BASED COST-EFFECTIVENESS MODELS
Author(s)
Tushar Srivastava, MSc1, Hanan Irfan, MSc2, Kunal Swami, MASc, MSc2, Vikas Badola, BTech2, Shilpi Swami, MSc1;
1ConnectHEOR, London, United Kingdom, 2ConnectHEOR, Delhi, India
OBJECTIVES: Quality control (QC) of Excel-based cost-effectiveness (CE) models is essential for HTA credibility but remains manual, time-consuming, and inconsistently documented. We evaluated an AI-driven QC system that ingests CE models in Excel and executes a predefined QC checklist, and compared its performance with that of human validators.
METHODS: The AI agent for QC interprets each checklist item to infer the expected behavior, determines whether the item applies to the model type, and uses a reasoning layer to generate a stepwise plan (identify worksheets/ranges, extract values, compare outputs, perform independent calculations). An executor layer then performs static data extractions and dynamic changes (e.g., changing dropdowns/parameters, recalculating) to validate response patterns. For each check, the tool outputs pass/fail, supporting observations, and corrective recommendations. The agent was tested using two complex models: a Markov model and a Partitioned Survival Model (PSM). To evaluate performance, 20 errors of varying difficulty were manually seeded across both models.
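To make the dynamic-change step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a workbook can be represented as a dictionary of cell values and that a recalculation function is available; the names `CheckResult`, `dynamic_check`, `toy_recalc`, and all cell addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a workbook: {"Sheet!Cell": value}.
Model = Dict[str, float]


@dataclass
class CheckResult:
    """One checklist outcome: pass/fail, observations, recommendation."""
    item: str
    passed: bool
    observations: List[str] = field(default_factory=list)
    recommendation: str = ""


def dynamic_check(model: Model, recalc: Callable[[Model], Model],
                  input_cell: str, new_value: float,
                  output_cell: str, item: str) -> CheckResult:
    """Change one input, recalculate, and confirm the output responds."""
    baseline = recalc(dict(model))[output_cell]
    changed = dict(model)
    changed[input_cell] = new_value
    updated = recalc(changed)[output_cell]
    passed = updated != baseline
    obs = [f"{output_cell}: {baseline} -> {updated} "
           f"after setting {input_cell}={new_value}"]
    rec = "" if passed else f"Check that {output_cell} is linked to {input_cell}."
    return CheckResult(item, passed, obs, rec)


def toy_recalc(m: Model) -> Model:
    # Toy "model" (assumption): total cost = unit cost * number of cycles.
    out = dict(m)
    out["Results!B2"] = m["Inputs!B1"] * m["Inputs!B2"]
    return out


model = {"Inputs!B1": 100.0, "Inputs!B2": 10.0}
result = dynamic_check(model, toy_recalc, "Inputs!B1", 200.0,
                       "Results!B2", "Total cost responds to unit cost")
print(result.passed, result.observations[0])
```

In a real setting the recalculation would be delegated to Excel or a spreadsheet engine; the dictionary here only illustrates the change, recalculate, compare pattern described above.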
RESULTS: In the Markov model, the agent identified 20/20 (100%) seeded errors, including a complex logic error where state-transition probabilities did not sum to one under specific subgroup scenarios. In the PSM, the tool detected 19/20 (95%) errors, successfully identifying a mismatch between the survival function inputs and the extrapolated area-under-the-curve (AUC) calculations. Notably, the agent tested complex logic that required model recalculation, a task that is typically challenging even for human auditors. In contrast, human review identified only 80% of errors. The automated QC process was completed in <2 hours per model, representing a >90% reduction in time compared to manual review.
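The transition-probability check mentioned above can be illustrated with a short sketch. This is an assumption about how such a check might look, not the agent's actual logic; the function name `rows_not_summing_to_one`, the scenario names, and the matrices are hypothetical illustration values, not data from the study.

```python
import math
from typing import Dict, List

Matrix = List[List[float]]


def rows_not_summing_to_one(matrix: Matrix, tol: float = 1e-9) -> List[int]:
    """Return indices of rows whose probabilities do not sum to 1."""
    return [i for i, row in enumerate(matrix)
            if not math.isclose(sum(row), 1.0, abs_tol=tol)]


# Hypothetical transition matrices per scenario (rows = from-state).
scenarios: Dict[str, Matrix] = {
    "base_case": [[0.80, 0.15, 0.05],
                  [0.00, 0.70, 0.30],
                  [0.00, 0.00, 1.00]],
    # Seeded-style error: the first row sums to 0.95 in this subgroup only.
    "subgroup_A": [[0.70, 0.20, 0.05],
                   [0.00, 0.70, 0.30],
                   [0.00, 0.00, 1.00]],
}

# Run the check under every scenario, keeping only scenarios with failures.
failures = {name: bad
            for name, m in scenarios.items()
            if (bad := rows_not_summing_to_one(m))}
print(failures)
```

Running the check once per scenario is what exposes an error that only appears under a specific subgroup setting, which a single base-case inspection would miss.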
CONCLUSIONS: Across Markov and PSM case studies with controlled fault injection, an AI-driven, reasoning-based QC approach demonstrated strong accuracy and efficiency gains. These findings support the potential role of agentic AI systems in delivering scalable, transparent, and reproducible QC for Excel-based CE models, complementing traditional expert review in HTA workflows.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR146
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas