A Layered Approach to Reducing Hallucinations in LLMs for Structured and Unstructured Clinical Data
Author(s)
Ashwin Kumar Rai, MS¹; Devika Bhandary, MSc²; Victoria Ikoro, PhD²; Andre Ng, MSc².
¹Director of Data Science & Advanced Analytics, Thermo Fisher Scientific, Overland Park, KS, USA; ²Thermo Fisher Scientific, London, United Kingdom.
OBJECTIVES: Large Language Models (LLMs) are increasingly explored to automate literature reviews, summarize patient records, and generate clinical insights from structured and unstructured clinical data. However, hallucinations, i.e., fabricated or misleading outputs, pose risks when such insights inform clinical or policy decisions. This abstract outlines a layered approach to minimizing hallucinations through prompt tuning, human validation, and operational orchestration with LangChain, yielding robust pipelines for safe LLM use in healthcare.
METHODS: A practical framework was developed that combines refined, validated prompts with systematic orchestration. The first layer focuses on crafting and tuning prompts to steer LLMs toward context-specific, factual outputs; candidate prompts are stress-tested with domain experts to ensure alignment with clinical and research requirements before being accepted as validated. In the second layer, LangChain’s modular architecture operationalizes the validated prompts by chaining tasks with prompt templates, retrieval-augmented generation (RAG), and agentic control. Prompt templates isolate model instructions from application logic, improving consistency and backend flexibility. For example, biomedical records can be parsed with a domain-specific transformer from Hugging Face, enriched with EHR-based retrieval, and passed to GPT-4 for generative explanations; Ollama may be used for on-device inference in cost-sensitive environments. Retrieval-aware prompts ensure that relevant context reaches each model, and LangChain’s agents dynamically select models based on latency, cost, or performance criteria, automating the validated workflow.
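A minimal sketch of this two-layer pattern, using LangChain’s LCEL chaining interface: the validated prompt template is kept separate from the orchestration logic, retrieved EHR context is injected before generation, and a simple selector stands in for agentic model routing. The retriever `ehr_retriever`, the template wording, and the routing rule are illustrative assumptions, not the authors’ production prompts or agent configuration.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

# Layer 1: a validated, version-controlled prompt template. Model
# instructions live here, separate from pipeline logic, so the backend
# model can be swapped without rewriting the workflow.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a clinical summarization assistant. Answer ONLY from the "
     "provided context. If the context is insufficient, say so instead "
     "of guessing."),
    ("human", "Context:\n{context}\n\nQuestion: {question}"),
])

def pick_model(cost_sensitive: bool):
    # Stand-in for agentic model selection: route to a local Ollama
    # model in cost-sensitive environments, otherwise call GPT-4.
    if cost_sensitive:
        return ChatOllama(model="llama3")
    return ChatOpenAI(model="gpt-4", temperature=0)

def answer(question: str, ehr_retriever, cost_sensitive: bool = False) -> str:
    # Layer 2: RAG orchestration. Retrieve EHR-derived context, then run
    # the validated template -> model -> text chain.
    docs = ehr_retriever.invoke(question)
    context = "\n\n".join(d.page_content for d in docs)
    chain = prompt | pick_model(cost_sensitive) | StrOutputParser()
    return chain.invoke({"context": context, "question": question})
```

Keeping the template, retriever, and model selector as separate components is what allows each layer to be validated and audited independently.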
RESULTS: Combining prompt tuning, validation, and structured orchestration reduces hallucinations and improves factual grounding. Human-in-the-loop oversight and prompt audit trails help maintain transparency and trust.
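As one illustration of how a prompt audit trail might be kept, a minimal sketch assuming a JSON-lines log; the field names and the `reviewed` flag for human-in-the-loop sign-off are assumptions for illustration, not a prescribed schema.

```python
import datetime
import hashlib
import json

def log_interaction(prompt: str, response: str, model: str,
                    path: str = "audit_log.jsonl") -> None:
    """Append a hashed, timestamped record so each generated output can
    be traced back to the exact prompt and model that produced it."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model": model,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt": prompt,
        "response": response,
        "reviewed": False,  # flipped to True after human sign-off
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```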
CONCLUSIONS: A layered approach—tuning and validating prompts, then automating with LangChain—equips healthcare analysts with a scalable, interpretable, and compliant way to deploy LLMs. This structure ensures reliable context delivery, dynamic model selection, and secure integration into clinical workflows, bridging LLM potential with real-world healthcare requirements. Ongoing refinement remains essential as generative AI adoption expands.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR4
Topic
Methodological & Statistical Research, Real World Data & Information Systems, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas