Every failing AI initiative I have diagnosed in the past three years had a working model. The language model did what it was supposed to do when you fed it clean data in a controlled test. The problem was never the model. The problem was everything that happened before the model received its input.
This is the context layer — the second component of the AI Operating System. It defines how data reaches the AI workflow, in what shape, at what speed, and with what domain knowledge attached. Get it right and a mediocre model produces excellent results. Get it wrong and a state-of-the-art model produces nothing usable.
In The AI Operating System, I dedicate an entire chapter to this component because it is the most consistently underestimated factor in enterprise AI. Companies invest in model selection, prompt engineering, and fine-tuning while ignoring the fact that their data cannot reach the workflow in a usable form.
What context actually means
Context is not a synonym for data. Data is raw material. Context is data that has been made accessible, shaped for the task, kept current, and enriched with domain knowledge. Four properties define a functioning context layer:
Data accessibility
Can the AI workflow access the data it needs without manual intervention? Not "do we have the data" — that question is almost always answered yes. The question is whether there is a programmatic path from where the data lives to where the model needs it.
In a typical DACH Mittelstand company, the answer involves SAP, a handful of Excel files maintained by specific individuals, a document management system that predates the smartphone, and tribal knowledge locked in the heads of three people who have been with the company for twenty years.
The data exists. The path does not.
Data quality
Quality does not mean perfection. It means fitness for purpose. An AI workflow that classifies incoming insurance claims needs the claim description, damage type, and policy number. If those three fields are consistently populated, the data quality is sufficient — even if the customer address has formatting inconsistencies.
The mistake is treating data quality as a binary state. Either "our data is clean" or "our data is a mess." Neither framing is useful. The useful question is: for this specific workflow, are the required fields sufficiently populated and consistent to produce reliable outputs?
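One way to make that question concrete is to measure the fill rate of only the workflow-critical fields and ignore everything else. The sketch below is a minimal illustration, not a prescribed implementation: the field names, the sample records, and the function names are assumptions, and the 0.8 threshold simply mirrors the 80% fill rate used in the checklist later in this article.

```python
from typing import Any

# Hypothetical field names for the claims classification example.
REQUIRED_FIELDS = ["claim_description", "damage_type", "policy_number"]


def fill_rates(records: list[dict[str, Any]], fields: list[str]) -> dict[str, float]:
    """Return the share of records in which each field is populated."""
    if not records:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        populated = sum(
            1 for record in records if record.get(field) not in (None, "")
        )
        rates[field] = populated / len(records)
    return rates


def is_fit_for_purpose(records, fields=REQUIRED_FIELDS, threshold=0.8) -> bool:
    """True if every required field meets the (assumed) fill-rate threshold."""
    return all(rate >= threshold for rate in fill_rates(records, fields).values())


# Illustrative records: the address is messy, but it is not a required field.
sample_claims = [
    {"claim_description": "Hail damage to roof", "damage_type": "storm",
     "policy_number": "P-1001", "address": "hauptstr 5"},
    {"claim_description": "Rear-end collision", "damage_type": "vehicle",
     "policy_number": "P-1002", "address": "Hauptstraße 7, 2. OG"},
]

print(fill_rates(sample_claims, REQUIRED_FIELDS))
print(is_fit_for_purpose(sample_claims))
```

The address inconsistencies never enter the calculation, which is the point: quality is judged per workflow, not per database.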
Data freshness
How old is the data when it reaches the model? A claims triage workflow that processes yesterday's claims is useful. One that processes last month's claims is useless. A product recommendation engine using last week's inventory data will recommend out-of-stock items.
Freshness requirements vary by workflow. Some need real-time data. Most need data that is less than 24 hours old. Almost none can tolerate data that is more than a week stale.
Domain context
Raw data without domain context produces raw outputs. The model needs to know that "Kaskoschaden" is not the same as "Haftpflichtschaden." It needs to know that a €5,000 claim on a commercial policy is routine but a €5,000 claim on a private liability policy is unusual. It needs to understand that "Lieferant A" has been reliable for fifteen years and "Lieferant B" defaulted twice last quarter.
Domain context is the institutional knowledge that experienced employees carry in their heads. Making it explicit and available to the AI workflow is one of the hardest and most valuable parts of building the context layer.
The gap between "we have data" and "the AI can use it"
Every enterprise I work with has data. No enterprise I work with has data that is immediately usable by an AI workflow. The gap between possession and usability is the context layer problem, and it manifests in predictable patterns across DACH industries.
Pattern 1: Data locked in SAP
The data is in SAP. The SAP system has no modern API. Extracting data requires either a custom ABAP report (which IT estimates will take four months) or a manual export (which someone runs every Friday afternoon). Neither is a foundation for a production AI workflow.
This pattern is especially common in manufacturing and retail. The data that would make AI workflows transformative — order histories, inventory levels, supplier performance — is technically accessible but practically locked behind integration barriers.
Pattern 2: Excel as integration layer
The real data pipeline is a collection of Excel files maintained by specific individuals. The procurement specialist has a spreadsheet that tracks supplier lead times. The quality manager has one that logs defect patterns. The sales director has one with customer segment profitability.
These spreadsheets contain genuine business intelligence. They also represent single points of failure, are never version-controlled, and cannot be accessed programmatically.
Pattern 3: Tribal knowledge
The most valuable context is not in any system. It is in the judgment of experienced employees. The claims handler who knows that a specific repair shop inflates estimates. The procurement manager who knows that a particular supplier's quoted lead times are always two weeks optimistic. The customer service agent who can tell from the tone of a complaint whether it will escalate.
This knowledge is real, valuable, and completely invisible to any AI system that does not have a mechanism to capture and encode it.
Building the context layer
The context layer is not a data warehouse project. It is a focused effort to make the right data available for specific workflows. Here is the practical architecture.
Step 1: Map data requirements per workflow
Start with the workflow, not the data. For each AI workflow you plan to deploy, document exactly what data the model needs to produce its output. Be specific: field names, formats, freshness requirements, volume.
An insurance claims triage workflow might need: claim description (free text), damage category (coded), claim amount (numeric), policy type (coded), customer history (last 5 claims). That is five data points. Not "all customer data." Five specific fields.
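Written down as a small specification, the triage example might look like the sketch below. The class name, the snake_case field names, and the 24-hour freshness values are assumptions made for illustration; only the five data points themselves come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRequirement:
    name: str            # hypothetical field identifier
    kind: str            # e.g. "free text", "coded", "numeric"
    max_age_hours: int   # assumed freshness requirement


CLAIMS_TRIAGE_REQUIREMENTS = [
    FieldRequirement("claim_description", "free text", 24),
    FieldRequirement("damage_category", "coded", 24),
    FieldRequirement("claim_amount", "numeric", 24),
    FieldRequirement("policy_type", "coded", 24),
    FieldRequirement("customer_history", "last 5 claims", 24),
]


def missing_fields(record: dict, requirements: list[FieldRequirement]) -> list[str]:
    """List the required fields that are absent from a record."""
    return [req.name for req in requirements if req.name not in record]


# A record that carries only two of the five required data points.
partial_record = {"claim_description": "Water damage in cellar", "claim_amount": 1800}
print(missing_fields(partial_record, CLAIMS_TRIAGE_REQUIREMENTS))
```

A specification this small is easy to review with the people who own the source systems, which is harder to do with a request for "all customer data."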
Step 2: Trace the data path
For each required data point, trace the path from where it lives to where the model needs it. Is there an API? A database connection? A file export? A person who copies it manually?
Document the current path honestly. If the path is "Maria exports it from SAP every Friday and emails it to Thomas," write that down. That is your starting point.
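Even an honest path like that one can be recorded in a structured form, so that manual steps are easy to count and assign. The following is a rough sketch under assumed names (`PathStep`, `manual_steps`); it records only what the example states and does not guess at what happens after the email arrives.

```python
from dataclasses import dataclass


@dataclass
class PathStep:
    description: str
    manual: bool  # True if a person has to perform this step by hand
    owner: str


# The path from the example, written down as it actually is.
claims_export_path = [
    PathStep("Export data from SAP every Friday", manual=True, owner="Maria"),
    PathStep("Email the export to Thomas", manual=True, owner="Maria"),
]


def manual_steps(path: list[PathStep]) -> list[PathStep]:
    """Return the steps that still depend on a person."""
    return [step for step in path if step.manual]


for step in manual_steps(claims_export_path):
    print(f"manual: {step.description} (owner: {step.owner})")
```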
Step 3: Build the minimum viable pipeline
You do not need a real-time streaming architecture for Level 1. You need a reliable, automated pipeline that delivers the required data in the required format at the required frequency.
For many DACH Mittelstand workflows, this means: a scheduled database query or API call that runs nightly, deposits a structured file in a defined location, and triggers a validation check. If the file is valid, the AI workflow processes it. If not, an alert fires.
This is not glamorous. It is reliable. And reliability is the entire point.
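The validate-then-process step can be sketched in a few lines. This is a minimal illustration assuming the nightly job delivers a CSV export; the column names and the `process` and `alert` callbacks are hypothetical, and the scheduling itself (cron, a job scheduler, or similar) is left out.

```python
import csv
import io

# Assumed column names for the nightly claims export.
REQUIRED_COLUMNS = {"claim_id", "claim_description", "damage_category", "claim_amount"}


def validate_export(text: str) -> list[str]:
    """Return a list of validation problems; an empty list means the file is valid."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    rows = list(reader)
    if not rows:
        problems.append("file contains no rows")
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        try:
            float(row["claim_amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: claim_amount is not numeric")
    return problems


def run_nightly(text: str, process, alert) -> bool:
    """Hand a valid file to the workflow; otherwise fire the alert."""
    problems = validate_export(text)
    if problems:
        alert(problems)
        return False
    process(text)
    return True


valid_export = (
    "claim_id,claim_description,damage_category,claim_amount\n"
    "1,Hail damage,storm,1200.50\n"
)
run_nightly(valid_export, process=lambda t: print("processing"), alert=print)
```

Passing `process` and `alert` in as callables keeps the check independent of whichever workflow engine or alerting channel is actually in use.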
Step 4: Encode domain context
Take the tribal knowledge and make it explicit. This typically means building a knowledge base — structured documents that capture the rules, exceptions, and judgment heuristics that experienced employees use.
For a claims triage workflow: a table mapping damage types to expected claim ranges. A list of flagged repair shops. Rules for when a claim should be escalated regardless of amount. These are not training data for the model — they are reference material that the workflow consults during processing.
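As a rough sketch, that reference material can start life as plain lookup structures that the workflow consults for each claim. Every value below is an invented placeholder (the euro ranges, the repair shop name, the escalation rule), and the function and field names are assumptions; real values would come from interviews with the domain experts.

```python
# Placeholder reference data: not real actuarial figures or real companies.
EXPECTED_CLAIM_RANGE_EUR = {
    "Kaskoschaden": (500, 8000),
    "Haftpflichtschaden": (200, 3000),
}
FLAGGED_REPAIR_SHOPS = {"Example Repair Shop GmbH"}
ALWAYS_ESCALATE_DAMAGE_TYPES = {"personal_injury"}


def triage_flags(claim: dict) -> list[str]:
    """Compare one claim against the encoded domain rules."""
    flags = []
    damage_type = claim["damage_type"]
    expected = EXPECTED_CLAIM_RANGE_EUR.get(damage_type)
    if expected is None:
        flags.append("no expected range on file for damage type")
    else:
        low, high = expected
        if not low <= claim["amount_eur"] <= high:
            flags.append("amount outside expected range")
    if claim.get("repair_shop") in FLAGGED_REPAIR_SHOPS:
        flags.append("flagged repair shop")
    if damage_type in ALWAYS_ESCALATE_DAMAGE_TYPES:
        flags.append("escalate regardless of amount")
    return flags


print(triage_flags({
    "damage_type": "Haftpflichtschaden",
    "amount_eur": 5000,
    "repair_shop": "Example Repair Shop GmbH",
}))
```

The same 5,000 euro amount passes without a flag as a Kaskoschaden under these placeholder ranges, which is exactly the kind of distinction the raw data alone does not carry.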
Retrieval-Augmented Generation (RAG) architectures are a common approach here. The model retrieves relevant domain context from the knowledge base before generating its output. The quality of the knowledge base directly determines the quality of the output.
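The retrieve-then-generate flow can be illustrated with a deliberately simplified sketch. It ranks knowledge base entries by plain word overlap, which is a stand-in for the embedding-based search a production RAG system would typically use, and it stops at building the prompt rather than calling any model. The knowledge base entries, including the second-estimate rule, are illustrative assumptions.

```python
import re

# Illustrative knowledge base entries (the third is an invented rule).
KNOWLEDGE_BASE = [
    "Kaskoschaden covers damage to the policyholder's own vehicle.",
    "Haftpflichtschaden covers damage the policyholder causes to third parties.",
    "Claims involving a flagged repair shop require a second estimate.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy relevance score)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query: str, documents: list[str]) -> str:
    """Place the retrieved context ahead of the claim text."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Reference material:\n{context}\n\nClaim:\n{query}"


claim_text = "Kaskoschaden after hail, vehicle at flagged repair shop"
print(build_prompt(claim_text, KNOWLEDGE_BASE))
```

Whatever the retrieval method, the model only ever sees what the knowledge base contains, so a missing or outdated entry shows up directly in the output.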
Step 5: Establish freshness guarantees
Define and enforce freshness SLAs for each data source. If the claims data must be less than 24 hours old, build monitoring that alerts when the pipeline has not run. If the knowledge base must reflect current policies, assign an owner who reviews it monthly.
Freshness is not a technical property — it is an operational commitment. Someone must own it.
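The monitoring half of that commitment can be small. The sketch below assumes each source records a last-updated timestamp somewhere the check can read it; the source names and the 31-day window standing in for "reviewed monthly" are assumptions, while the 24-hour limit comes from the example above.

```python
from datetime import datetime, timedelta, timezone

# Assumed SLA table: source name -> maximum tolerated age.
FRESHNESS_SLA = {
    "claims_export": timedelta(hours=24),
    "knowledge_base": timedelta(days=31),
}


def stale_sources(last_updated: dict[str, datetime], now: datetime) -> list[str]:
    """Return the sources that are older than their SLA or have never run."""
    stale = []
    for source, max_age in FRESHNESS_SLA.items():
        updated = last_updated.get(source)
        if updated is None or now - updated > max_age:
            stale.append(source)
    return stale


now = datetime(2024, 1, 10, 6, 0, tzinfo=timezone.utc)
last_updated = {
    "claims_export": now - timedelta(hours=30),   # pipeline missed a night
    "knowledge_base": now - timedelta(days=10),
}
print(stale_sources(last_updated, now))
```

A source with no timestamp at all is treated as stale, so a pipeline that has never run cannot pass silently. Who gets the alert is the ownership question, and no script answers that.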
The Context Readiness checklist
Before deploying any AI workflow, assess the context layer against these criteria:
Data accessibility:
- Can each required data point be accessed programmatically?
- Is the data path documented and owned by a named person?
- Can a staging/test environment be provisioned without touching production?
Data quality:
- Are the workflow-critical fields sufficiently populated (80%+ fill rate)?
- Are coded fields consistent (no duplicate categories, no free-text in coded fields)?
- Is there a process for handling missing or invalid data?
Data freshness:
- Is the freshness requirement defined for each data source?
- Is there automated monitoring for pipeline failures?
- Does the current pipeline meet the freshness requirement?
Domain context:
- Have domain experts been interviewed to capture decision heuristics?
- Is the knowledge base structured and queryable?
- Is there an owner responsible for keeping the knowledge base current?
Score each area as blocking, weak, adequate, or strong. Two or more blocking scores mean the context layer needs work before workflow deployment is viable.
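The scoring rule itself is simple enough to write down. The following is a minimal sketch with assumed names; it implements only the rule stated above (two or more blocking scores) and makes no judgment about weaker combinations.

```python
SCORES = ("blocking", "weak", "adequate", "strong")


def blocking_areas(assessment: dict[str, str]) -> list[str]:
    """Return the checklist areas scored as blocking."""
    for area, score in assessment.items():
        if score not in SCORES:
            raise ValueError(f"unknown score for {area!r}: {score!r}")
    return [area for area, score in assessment.items() if score == "blocking"]


def needs_context_work(assessment: dict[str, str]) -> bool:
    """Apply the rule: two or more blocking scores mean work comes first."""
    return len(blocking_areas(assessment)) >= 2


# Illustrative assessment for a single workflow.
assessment = {
    "data accessibility": "blocking",
    "data quality": "adequate",
    "data freshness": "blocking",
    "domain context": "weak",
}
print(blocking_areas(assessment), needs_context_work(assessment))
```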
Why context compounds
The context layer is not a one-time build. It is an asset that appreciates over time. Every workflow that uses the context layer validates and enriches the underlying data. Claims that are correctly triaged confirm the accuracy of the domain rules. Claims that are incorrectly triaged reveal gaps that, when addressed, improve the next cycle.
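One hedged way to picture that feedback loop: record how each claim was routed and how it should have been routed, then look at where the errors cluster. The record layout and field names below are assumptions, and the sample outcomes are invented for illustration.

```python
from collections import Counter


def error_rate_by_damage_type(outcomes: list[dict]) -> dict[str, float]:
    """Share of incorrectly triaged claims per damage type."""
    totals = Counter()
    errors = Counter()
    for outcome in outcomes:
        damage_type = outcome["damage_type"]
        totals[damage_type] += 1
        if outcome["predicted_route"] != outcome["actual_route"]:
            errors[damage_type] += 1
    return {damage_type: errors[damage_type] / totals[damage_type]
            for damage_type in totals}


# Invented outcomes: routing decided by the workflow vs. the handler's final call.
outcomes = [
    {"damage_type": "Kaskoschaden", "predicted_route": "standard", "actual_route": "standard"},
    {"damage_type": "Kaskoschaden", "predicted_route": "standard", "actual_route": "standard"},
    {"damage_type": "Haftpflichtschaden", "predicted_route": "standard", "actual_route": "escalate"},
    {"damage_type": "Haftpflichtschaden", "predicted_route": "escalate", "actual_route": "escalate"},
]
print(error_rate_by_damage_type(outcomes))
```

A damage type with a persistently high error rate points at a domain rule worth revisiting with the experts who supplied it.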
This is how AI initiatives compound — not through better models, but through richer context. The learning component of the AI Operating System formalises this feedback loop. But it starts with the context layer.
The companies that build a production-grade context layer for their first workflow find that the second workflow is dramatically easier. The data paths exist. The knowledge base is started. The freshness guarantees are in place. The marginal cost of context drops with every deployment.
Where to start
If you suspect your AI initiatives are failing on context rather than models (and in my experience, they almost certainly are), start with two actions. First, take any stalled AI project and map its actual data path from source to model. Document every manual step, every Excel handoff, every tribal knowledge dependency. Second, compare that map against the Context Readiness checklist above.
The gap between where you are and where you need to be is your context layer project. It is less exciting than model selection. It is more important than everything else combined.
The full implementation guide for the context layer, including templates and worked examples, is in Chapter 04 of The AI Operating System.
For a conversation about building the context layer for your first AI workflow, book a Fit Call.