AI AGENT FOR AUTOMATED QUALITY CHECK OF MS EXCEL BASED COST-EFFECTIVENESS MODELS
Author(s)
Tushar Srivastava, MSc1, Hanan Irfan, MSc2, Kunal Swami, MASc, MSc2, Vikas Badola, BTech2, Shilpi Swami, MSc1.
1ConnectHEOR, London, United Kingdom, 2ConnectHEOR, Delhi, India.
OBJECTIVES: Quality control (QC) of Excel-based cost-effectiveness (CE) models is essential for the credibility of health technology assessment (HTA) but remains manual, time-consuming, and inconsistently documented. We evaluated an AI-driven QC system that ingests CE models in Excel and executes a predefined QC checklist, and compared its performance with that of human validators.
METHODS: The AI agent for QC interprets each checklist item to infer the expected behavior, determines its applicability to the model type, and uses a reasoning layer to generate a stepwise plan (identify worksheets/ranges, extract values, compare outputs, perform independent calculations). An executor layer then performs static data extractions and dynamic changes (e.g., changing dropdowns/parameters, recalculating) to validate response patterns. For each check, the tool outputs pass/fail, supporting observations, and corrective recommendations. The agent was tested using two complex models: a Markov model and a Partitioned Survival Model (PSM). To evaluate performance, 20 errors of varying difficulty were manually seeded into each model.
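The per-check output described above (pass/fail, observations, recommendation) and the idea of a dynamic check can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names CheckResult, toy_model, and check_discounting_responds are hypothetical, and a small function stands in for recalculating an Excel workbook. The check changes one input, recalculates, and tests whether the output moves in the expected direction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Inputs = Dict[str, float]
Outputs = Dict[str, float]


@dataclass
class CheckResult:
    """Hypothetical record of one checklist item's outcome."""
    check_id: str
    passed: bool
    observations: List[str] = field(default_factory=list)
    recommendation: str = ""


def toy_model(inputs: Inputs) -> Outputs:
    """Assumed stand-in for recalculating a workbook: discounted total cost."""
    years = int(inputs["horizon_years"])
    rate = inputs["discount_rate"]
    total = sum(inputs["annual_cost"] / (1.0 + rate) ** t for t in range(years))
    return {"total_cost": total}


def check_discounting_responds(
    inputs: Inputs, recalc: Callable[[Inputs], Outputs]
) -> CheckResult:
    """Dynamic check: a higher discount rate should lower discounted cost."""
    base = recalc(inputs)["total_cost"]
    # Perturb a single parameter and recalculate, leaving the original untouched.
    perturbed = dict(inputs, discount_rate=inputs["discount_rate"] + 0.02)
    changed = recalc(perturbed)["total_cost"]
    passed = changed < base
    return CheckResult(
        check_id="discount-rate-response",
        passed=passed,
        observations=[f"base={base:.2f}", f"after +2pp rate={changed:.2f}"],
        recommendation="" if passed else "Verify that cost formulas reference the discount rate cell.",
    )


base_inputs = {"annual_cost": 1000.0, "horizon_years": 10.0, "discount_rate": 0.035}
result = check_discounting_responds(base_inputs, toy_model)
print(result.passed, result.observations)
```

Passing the recalculation step in as a function keeps the check independent of how the workbook is actually driven, so the same check could in principle be run against a real Excel automation layer.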
RESULTS: In the Markov model, the agent identified 20/20 (100%) seeded errors, including a complex logic error where state-transition probabilities did not sum to one under specific subgroup scenarios. In the PSM, the tool detected 19/20 (95%) seeded errors, including a mismatch between the survival function inputs and the extrapolated area-under-the-curve (AUC) calculations. Notably, the agent tested complex logic that required model recalculation, a task that is typically challenging even for human auditors. In contrast, human review identified only 80% of errors. The automated QC process was completed in <2 hours per model, representing a >90% reduction in time compared to manual review.
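The kind of error cited for the Markov model, transition probabilities that fail to sum to one in particular subgroups, can be checked with a few lines of Python. This is an illustrative sketch only: the function name find_invalid_rows, the subgroup labels, and the matrices are assumptions made for the example, not values from the tested models.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def find_invalid_rows(
    matrices: Dict[str, Matrix], tol: float = 1e-9
) -> List[Tuple[str, int, float]]:
    """Return (subgroup, row index, row sum) for every invalid row.

    A row is invalid if any entry lies outside [0, 1] or if the row
    does not sum to one within the tolerance.
    """
    problems = []
    for subgroup, matrix in matrices.items():
        for i, row in enumerate(matrix):
            total = sum(row)
            out_of_range = any(p < 0.0 or p > 1.0 for p in row)
            if out_of_range or abs(total - 1.0) > tol:
                problems.append((subgroup, i, total))
    return problems


# Hypothetical three-state matrices (rows are "from" states).
matrices = {
    "overall": [[0.80, 0.15, 0.05], [0.0, 0.70, 0.30], [0.0, 0.0, 1.0]],
    # Seeded error: the first row sums to 1.05 in this subgroup only.
    "age_65_plus": [[0.80, 0.15, 0.10], [0.0, 0.70, 0.30], [0.0, 0.0, 1.0]],
}

problems = find_invalid_rows(matrices)
for subgroup, row_index, total in problems:
    print(f"{subgroup}: row {row_index} sums to {total:.4f}")
```

Because the error appears only under a subgroup scenario, a check of this kind has to be repeated after each scenario is selected and the model recalculated, which is why it belongs to the dynamic rather than the static checks.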
CONCLUSIONS: Across Markov and PSM case studies with controlled fault injection, an AI-driven, reasoning-based QC approach demonstrated strong accuracy and efficiency gains. These findings support the potential role of agentic AI systems in delivering scalable, transparent, and reproducible QC for Excel-based CE models, complementing traditional expert review in HTA workflows.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR146
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas