Context engineering replaced prompt engineering as the core discipline almost overnight. The framing is now familiar: architect what the model sees on every call, from instructions and tools to memory, retrieved documents, and state. Manage the token budget. Fight context rot. Decide what to keep, compress, and evict.
All of this is real and worth doing. But there are two different problems hiding under one phrase, and conflating them is why so many “context-engineered” systems still fail in production.
Two context problems, not one
The first problem is context engineering in the small: the token lifecycle inside a single agent run. What goes in the window, what gets summarised, what gets retrieved just in time, and how to stop the agent forgetting the file it was editing two hours ago. This is a genuine engineering discipline, and the playbooks for it are getting good.
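To make the "small" problem concrete, here is a minimal sketch of a keep/compress/evict pass over a token budget. Everything in it is an assumption for illustration: the `Item` type, the `priority` field, the word-count stand-in for a real tokenizer, and the use of plain truncation where a real system would summarise.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Item:
    text: str
    priority: int  # hypothetical ranking: higher means more important to keep


def rough_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def fit_to_budget(items: list[Item], budget: int) -> list[str]:
    """Keep high-priority items whole, truncate the first that overflows, evict the rest."""
    kept: list[str] = []
    remaining = budget
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        cost = rough_tokens(item.text)
        if cost <= remaining:
            # Keep: the item fits in what is left of the budget.
            kept.append(item.text)
            remaining -= cost
        elif remaining > 0:
            # Compress: truncation here, standing in for summarisation.
            kept.append(" ".join(item.text.split()[:remaining]))
            remaining = 0
        # Otherwise evict: the item is dropped because the budget is spent.
    return kept
```

Note what the sketch cannot do: it decides how much of each item survives, but it has no way of knowing whether any item was correct to begin with.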
The second problem is context engineering in the large: Where does the right context come from in the first place? Before you can decide what to put in the window, you have to know what the relevant data, documents, definitions, and constraints actually are, and you have to be able to produce them on demand, correctly, for this user, this question, this moment.
The first problem is about the window. The second is about the organisation behind the window.
A perfectly tuned context window is worthless if there is nothing trustworthy to fill it with. You can master compression, retrieval ordering, and eviction strategy, and still ship a confidently wrong system, because the meaning it assembled was fragmented, stale, or contested at the source. The asymmetry everyone quotes (“A poor prompt in a well-engineered context often succeeds; a great prompt in a bad context fails.”) has a deeper version: a brilliantly engineered window over an ungoverned knowledge base is just faster nonsense.
Context assembly is the part nobody owns
I have written before that in a modern knowledge ecosystem, context is not stored in a single system. It is assembled at query time. When a user asks a question, the system has to dynamically combine data from the platform, relevant documentation from content systems, and constraints and structure from knowledge models.
Context assembly isn’t another repository. It is a runtime capability that selects relevant signals across data, content, and knowledge structures, applies organisational rules and process logic, and adapts to the user’s role and intent.
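As a rough illustration of that runtime capability, the sketch below selects signals by topic, applies a single role-based rule, and groups what survives by source. The `Signal` type, its fields, and the three source labels are hypothetical; a real assembler would apply far richer organisational rules and infer intent rather than receive a topic string.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str  # assumed to be one of "data", "content", "knowledge"
    topic: str
    text: str
    allowed_roles: frozenset[str]  # hypothetical organisational rule


def assemble_context(signals: list[Signal], topic: str, role: str) -> dict[str, list[str]]:
    """Select signals relevant to the topic, filter by role, and group by source."""
    assembled: dict[str, list[str]] = {"data": [], "content": [], "knowledge": []}
    for signal in signals:
        if signal.topic != topic:
            continue  # not relevant to this question
        if role not in signal.allowed_roles:
            continue  # the rule says this user may not see it
        assembled[signal.source].append(signal.text)
    return assembled
```

The same question therefore yields a different assembled context for a different role, which is the point: the result is computed per request, not stored.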
This is exactly the “large” problem the context-engineering conversation keeps brushing past. The discipline talks confidently about retrieval, pulling the right documents into the window, while quietly assuming the existence of a coherent, governed body of meaning to retrieve from. In most enterprises, that body doesn’t exist. Definitions conflict between domains. Documented intent sits in SharePoint with no relationship to the data. The organisational model is in someone’s head or a slide from 2021. Retrieval over that is retrieval over chaos.
The missing foundation is semantic, not syntactic
This is where the “large” problem becomes an architecture problem rather than a prompting one.
The industry is, in fact, rediscovering this from the implementation side. GraphRAG (pairing vector retrieval with a knowledge graph for structural reasoning) has become a default pattern for reliable enterprise AI precisely because flat retrieval over documents doesn’t carry enough meaning. Ontology-grounded agents reduce hallucination because the ontology supplies what the text alone cannot: typed relationships, constraints, canonical identifiers, and the explicit class structure of the domain. The lesson keeps arriving by a different road: structure and meaning underneath the retrieval are what make the retrieval trustworthy.
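A toy sketch in the spirit of that pattern: documents are ranked for a query, then paired with typed facts about a known entity. It is not a GraphRAG implementation. Word overlap stands in for vector similarity, the triples and documents are invented, and the entity is passed in explicitly rather than linked from the query.

```python
from __future__ import annotations

# Hypothetical typed edges: (subject, relation, object).
TRIPLES = [
    ("churn_rate", "defined_by", "finance_glossary"),
    ("churn_rate", "applies_to", "subscription_customers"),
    ("net_revenue", "defined_by", "finance_glossary"),
]

# Hypothetical document store.
DOCS = {
    "doc1": "churn rate rose in the third quarter",
    "doc2": "net revenue guidance for next year",
}


def overlap_score(query: str, text: str) -> int:
    # Stand-in for embedding similarity: count of shared lowercase words.
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Return the ids of the k documents that share the most words with the query."""
    ranked = sorted(docs, key=lambda doc_id: overlap_score(query, docs[doc_id]), reverse=True)
    return ranked[:k]


def facts_about(entity: str, triples: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Return every typed edge whose subject is the given entity."""
    return [triple for triple in triples if triple[0] == entity]


def grounded_context(query: str, entity: str, docs: dict[str, str],
                     triples: list[tuple[str, str, str]]) -> dict[str, list]:
    """Pair retrieved documents with the structured facts the text alone does not carry."""
    return {"documents": retrieve(query, docs), "facts": facts_about(entity, triples)}
```

The retrieved text says churn rose; only the typed edges say which definition of churn applies and to whom.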
Standards are converging on the same realisation. The Open Semantic Interchange specification, live since early 2026, exists so that the meaning of a metric travels consistently across tools and AI systems instead of being redefined in every one. That matters. But it governs analytics semantics: structured data inside the platform. The rest of the context an agent needs to act well (contractual terms, process rules, design rationale, organisational applicability) never appears as a metric and lives entirely outside that scope. A semantic layer is one slice of context architecture, not the whole of it.
The unit of work has moved
Here is the shift that the “context engineering” label undersells. The unit of work used to be the prompt. Then it became the context window. The real unit, for an enterprise, is the assembled context, and the quality of assembly depends on everything underneath it: governed definitions, a maintained ontology, content tied to data, and an organisational model that reflects reality.
That means context engineering, done seriously, isn’t a prompt skill bolted onto an app. It is a knowledge-ecosystem architecture. The window-level discipline still matters enormously, but it is the last mile. The first mile is whether your organisation has anything coherent to assemble.
What this means in practice
A few consequences follow if you take the “large” problem seriously:
- Treat context as a product, with SLAs. Freshness, provenance, ownership, and failure modes are properties of the context you assemble, not afterthoughts. If a definition is stale, the agent is wrong, and you should be able to see why.
- Invest below the model. The often-quoted ratio — for every dollar on AI tooling, several should go to the data and semantic architecture beneath it — is uncomfortable precisely because it is correct. The window is cheap to tune and expensive to feed well.
- Build context assembly as a governed runtime capability, not a pile of retrieval scripts. Someone owns the rules that decide which signals are assembled and which definitions win.
- Put meaning under the retrieval. Ontologies and a semantic layer aren’t academic luxuries; they are what turns “retrieve some similar documents” into “assemble the correct, constrained context for this decision.”
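One way to picture "context as a product" is to attach ownership, provenance, and a freshness SLA to every assembled item, then refuse to pass stale items to the agent silently. The sketch below assumes a hypothetical `ContextItem` shape and a per-item `max_age`; neither comes from any particular platform.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ContextItem:
    name: str
    value: str
    owner: str           # who is accountable for this definition
    source: str          # provenance: where the item was assembled from
    updated_at: datetime
    max_age: timedelta   # hypothetical freshness SLA


def check_freshness(items: list[ContextItem], now: datetime) -> tuple[list[ContextItem], list[str]]:
    """Split items into those within their SLA and readable violation messages."""
    fresh: list[ContextItem] = []
    violations: list[str] = []
    for item in items:
        age = now - item.updated_at
        if age <= item.max_age:
            fresh.append(item)
        else:
            overdue_days = (age - item.max_age).days
            violations.append(
                f"{item.name} from {item.source} is stale by {overdue_days} days; owner: {item.owner}"
            )
    return fresh, violations
```

When the agent is wrong because a definition is stale, the violation message names the source and the owner, so the failure is visible and has somewhere to go.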
The prompt was never the point
The move from prompt engineering to context engineering was the right instinct, but it stops one layer too high. The prompt was never the point. The context is.
And at enterprise scale, context isn’t a clever arrangement of tokens; it is an architecture: data, content, and knowledge structures, assembled at query time under governed meaning.
You can engineer the window all you like. If there is no knowledge ecosystem behind it, you aren’t grounding your AI. You are just stuffing the context more elegantly.