RESPONSIBLE AI ADOPTION IN HEOR THROUGH HUMAN-IN-THE-LOOP (HITL) FRAMEWORKS ALIGNING WITH GLOBAL HTA EXPECTATIONS
Author(s)
Inderpreet S. Marwaha, MSc, RPh1, Rajdeep Kaur, PhD1, Shubhram Pandey, MSc1, Barinder Singh, RPh2, Gagandeep Kaur, M.Pharm1.
1Pharmacoevidence Pvt. Ltd., SAS Nagar, Mohali, India, 2Pharmacoevidence Pvt. Ltd., SAS Nagar, Mohali, India.
OBJECTIVES: Despite emerging AI position statements and initial momentum, Generative AI (GenAI) adoption in HEOR remains limited due to inherent uncertainties. To address the lack of practical guidance, we detail the implementation of a comprehensive Human-in-the-Loop (HITL) framework for HEOR workstreams in alignment with global HTA expectations.
METHODS: We implemented a governance architecture integrating mandatory HITL checkpoints across four workflows: (i) Literature screening (GenAI as a second reviewer for inclusion decisions, generating diagnostic confidence scores and rationales), (ii) Data extraction and quality appraisal (GenAI as a second reviewer, or AI as an augmentation tool), (iii) Evidence synthesis and reporting (automated reports and global value dossiers [GVDs]), and (iv) Analytical workstreams (automated insights, visualizations, and statistical analysis code). AI outputs remained advisory, with predefined decision rules triggering expert validation before integration into the evidence chain.
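The checkpoint logic described above for literature screening can be sketched as follows. This is a minimal, illustrative implementation under stated assumptions: the threshold value, field names (`ai_confidence`, `ai_rationale`), and the `include`/`exclude` labels are hypothetical and are not taken from the abstract; the abstract does not disclose its actual decision rules.

```python
from dataclasses import dataclass

# Hypothetical HITL checkpoint for literature screening: GenAI acts as a
# second reviewer, and a record is escalated to expert arbitration when
# the AI's confidence is low or its decision conflicts with the human
# reviewer's. The 0.80 cutoff is an illustrative assumption.
CONFIDENCE_THRESHOLD = 0.80

@dataclass
class ScreeningRecord:
    record_id: str
    human_decision: str   # "include" or "exclude"
    ai_decision: str      # "include" or "exclude"
    ai_confidence: float  # 0.0-1.0, reported by the GenAI reviewer
    ai_rationale: str     # free-text rationale from the model

def needs_expert_arbitration(rec: ScreeningRecord) -> bool:
    """AI output stays advisory: escalate on discordance or low confidence."""
    discordant = rec.human_decision != rec.ai_decision
    low_confidence = rec.ai_confidence < CONFIDENCE_THRESHOLD
    return discordant or low_confidence

def triage(records: list[ScreeningRecord]) -> tuple[list, list]:
    """Split records into auto-accepted and expert-arbitration queues."""
    auto, expert = [], []
    for rec in records:
        (expert if needs_expert_arbitration(rec) else auto).append(rec)
    return auto, expert
```

In this sketch, concordant high-confidence records pass through while discordant or low-confidence records (roughly 5% in the reported results) are queued for mandatory human arbitration, so the AI never finalizes an inclusion decision on its own.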
RESULTS: Across multiple reviews, literature screening achieved a mean agreement of 95% (range: 89% to 99%), with mandatory human arbitration of low-confidence outputs and discordant decisions affecting 5% of records. While automated extraction and quality appraisal reached expert concordance of 95% and 78%, respectively, expert adjudication remained mandatory to validate judgment-based assessments and contextual interpretations, and to resolve the AI's conservative risk bias. Furthermore, GenAI-enabled workflows delivered 90% submission-ready GVD outputs and achieved 92% alignment with expert-approved narratives. Expert review was used to finalize contextual framing, interpretive accuracy, and adherence to regulatory standards. For analytical workstreams, expert review of AI-generated insights, visual outputs, and statistical code was required by design to ensure defensible claims, logical consistency, and alignment with the underlying research methodology.
CONCLUSIONS: This implementation confirms that HITL-enabled GenAI augments rather than replaces expert oversight. By embedding mandatory checkpoints, HEOR teams can harness efficiency while preserving decision provenance and alignment with global HTA expectations. Ultimately, this supports responsible adoption, enabling researchers to focus on high-value activities while ensuring evidence integrity.
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR184
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas