Leveraging Generative Artificial Intelligence for the Creation of Global Value Dossiers Through a RAG Pipeline and Multi-Agent Integration
Author(s)
Sven L. Klijn, MSc1, Alison Johnson, PhD1, Sahana Joish, MBA1, Barinder Singh, RPh2, Shubhram Pandey, MSc2, Nicola Waddell, HNC3, Rajdeep Kaur2.
1Bristol Myers Squibb, Princeton, NJ, USA, 2Pharmacoevidence Pvt. Ltd., Mohali, India, 3Pharmacoevidence Pvt. Ltd., London, United Kingdom.
OBJECTIVES: This study aimed to evaluate the feasibility of automating the generation of various global value dossier (GVD) sections through generative artificial intelligence (GenAI). The goal was to combine data preprocessing of reference materials with data standardization, Retrieval-Augmented Generation (RAG) pipelines, and a multi-agent approach to produce accurate, traceable outputs with human oversight.
METHODS: In Phase 1, input files in different formats were processed using Optical Character Recognition (OCR) and standardized into Markdown. In Phase 2, agents generated value messages based on contextual evidence retrieved through the RAG pipeline. These were reviewed and validated by subject matter experts (SMEs) and automatically mapped into the appropriate GVD sections. In Phase 3, separate agents were configured to generate the different GVD sections. Generated outputs were validated by SMEs for completeness, clarity, accuracy, and traceability.
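The retrieval step in Phase 2 can be illustrated with a minimal sketch. The abstract does not specify the embedding model, vector store, or agent framework, so everything below is an assumption: a bag-of-words cosine similarity stands in for a real embedding model, and the names `Chunk`, `retrieve`, and `build_prompt`, the source identifiers, and the sample texts are hypothetical placeholders rather than study data. The sketch only shows how keeping a source identifier with each Markdown chunk could keep generated value messages traceable to their evidence.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_id: str  # hypothetical document identifier, kept for traceability
    text: str       # Markdown text assumed to come from the Phase 1 standardization


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return the k chunks most similar to the query."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(section: str, evidence: list[Chunk]) -> str:
    """Assemble a prompt that carries source ids alongside each evidence chunk."""
    cited = "\n".join(f"[{c.source_id}] {c.text}" for c in evidence)
    return (
        f"Draft a value message for the GVD section '{section}'.\n"
        "Use only the evidence below and cite source ids in brackets.\n\n"
        f"{cited}"
    )


# Placeholder corpus: invented text, not taken from the study's 140 documents.
corpus = [
    Chunk("doc-001", "Disease X has a high symptom burden and reduces quality of life."),
    Chunk("doc-002", "Current treatment guidelines recommend drug A as first-line therapy."),
    Chunk("doc-003", "Many patients relapse, so there is an unmet need for durable therapy."),
]

top = retrieve("unmet need for durable therapy after relapse", corpus, k=1)
prompt = build_prompt("Unmet needs", top)
print(prompt)
```

In this sketch the prompt would be passed to a section-specific agent, and the bracketed source ids give an SME reviewer a direct path from each generated statement back to its supporting document.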
RESULTS: A total of 140 documents were uploaded into the RAG pipeline, which generated a 73-page GVD comprising disease background, disease management, and unmet needs sections. Output included tables and visualizations, such as bar graphs, pie charts, and line graphs, produced without the need for human intervention. Kaplan-Meier and forest plots required manual intervention due to their statistical intricacies. Human input was required for approximately 5% of the disease background, 10% of the disease management, and 1% of the unmet needs sections, primarily to assist with formatting. The AI-generated GVD was assessed by SMEs for completeness, formatting, and traceability of data points, confirming accuracy of the output. The AI+human process resulted in 70-80% time savings compared to a human-only process.
CONCLUSIONS: This study demonstrates the feasibility of leveraging GenAI for parts of the GVD creation process, changing the GVD development timeline from weeks/months to days, while retaining accuracy and traceability. Further research is required to evaluate generalizability.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR136
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas