What Prompt Engineering Missed and How Context Engineering Fixes It
Author(s)
Hanan Irfan, MSc, Tushar Srivastava, MSc.
ConnectHEOR, London, United Kingdom.
OBJECTIVES: The use of large language models (LLMs) in health economics and outcomes research (HEOR) is expanding, yet their success critically depends on how we design their working environments. Historically, "prompt engineering" (crafting clever questions and examples) was seen as the key to unlocking LLM capabilities. However, in complex and niche domains like HEOR, prompt engineering alone has proven insufficient. The emerging discipline of "context engineering" now provides a more robust framework by shaping the entire operational context within which the LLM operates. This study compares the outputs of prompt-only versus context-engineered AI systems in executing domain-specific tasks within HEOR.
METHODS: Context engineering goes beyond the user prompt and includes system messages, chat history, retrieval-augmented generation (RAG), domain-specific tool integrations, and structured data provisioning. In technical domains like HEOR, where pre-trained models often lack adequate exposure to domain-specific concepts, methods, and terminology, additional scaffolding is required. A comparative case analysis was performed across three HEOR use cases: (1) automated report generation for cost-effectiveness models, (2) chatbot-based model explainers, and (3) abstract summarization from regulatory documents. For each use case, outputs from prompt-only configurations were compared to those using full context engineering.
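The distinction between the two configurations compared above can be sketched in code. The snippet below contrasts a prompt-only request with a context-engineered one that layers a system message, retrieved domain passages, and chat history. The document store, the keyword-overlap retrieval, and the message layout are illustrative assumptions standing in for a real RAG pipeline, not the authors' actual implementation.

```python
# Illustrative sketch: prompt-only vs. context-engineered LLM requests.
# The retrieval function and documents are hypothetical stand-ins for a
# real RAG index over HEOR model documentation.

def retrieve(query, documents, k=2):
    """Naive keyword-overlap retrieval standing in for a vector index."""
    q_terms = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def prompt_only(query):
    # The model sees only the bare user question; all domain knowledge
    # must come from pre-training.
    return [{"role": "user", "content": query}]

def context_engineered(query, documents, history):
    # A system message fixes the domain role, retrieved passages ground
    # the answer, and prior chat history carries conversational state.
    context = "\n".join(retrieve(query, documents))
    return ([{"role": "system",
              "content": "You are an HEOR analyst. Answer only from the "
                         "provided context; say 'unknown' otherwise."}]
            + history
            + [{"role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {query}"}])

# Toy knowledge fragments of the kind often scattered across models and PDFs.
docs = ["The cost-effectiveness model uses a 3-state Markov structure.",
        "Utilities were sourced from the pivotal trial EQ-5D data.",
        "The discount rate applied to costs and outcomes is 3.5%."]
msgs = context_engineered("What discount rate does the model use?", docs, [])
```

The same message list would then be passed to any chat-completion API; the point of the sketch is that the grounding passages, not the phrasing of the question, carry the domain knowledge.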
RESULTS: Prompt-engineering approaches such as Chain-of-Thought and Self-Consistency failed to generate reliable outputs for HEOR-specific tasks, particularly when source knowledge was fragmented across spreadsheets, models, or PDFs. In contrast, context-engineered solutions enabled accurate reasoning, especially when external memory or tool access was provided. Domain grounding improved significantly, and outputs were more aligned with technical expectations.
CONCLUSIONS: Prompt engineering is no longer sufficient for domain-specific tasks like those in HEOR, where LLMs are prone to hallucination. In niche fields where pre-trained models lack exposure, context engineering is indispensable. As AI adoption in HEOR accelerates, success will hinge less on clever prompts and more on well-structured, tool-integrated, and context-aware systems.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR222
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas