Every AI system will eventually have to answer to someone — an auditor, a supervisory authority, a customer, or a board member asking the only question that matters: "How does this thing actually work, and can you prove it?" The real decision is whether you build the ability to answer into the system from the start, or bolt it on later when the pressure arrives and the system is already live.
That second path is the expensive one, and not mainly in engineering hours. The discipline of software engineering has known for decades that the cost of fixing a problem climbs steeply the later you catch it — cheap at the design stage, painful once code is shipped, and worst of all once a system is in production with real data flowing through it. (The famous "100x in production" figure is itself contested, but the direction of travel is not seriously in doubt.) Compliance is no different. A logging gap caught on a whiteboard is a design note. The same gap caught by an auditor eighteen months into operation is a rebuild.
Compliance by design is the alternative. It means treating regulatory requirements as architectural constraints from day one — the same way you treat security, latency, or user experience. Not as an afterthought. Not as a separate workstream that runs in parallel and collides with engineering at the end. As a set of design decisions that shape every technical choice you make.
What compliance by design means for AI
The idea is not new. It is lifted directly from "data protection by design and by default," the principle codified in Article 25 of the GDPR, which obliges controllers to bake data protection into the architecture of processing rather than add it afterwards. Compliance by design simply extends that posture across the wider regulatory surface that an AI system in a DACH Mittelstand company now sits on: the GDPR, the EU AI Act, sector supervision such as BaFin in financial services, and a company's own internal governance.
For AI systems specifically, it comes down to six architectural commitments.
Audit logging from the first commit. For high-risk systems, this is not a matter of taste. Article 12 of the AI Act requires that high-risk systems technically allow the automatic recording of events over their lifetime — automatic meaning the system generates the logs itself, lifetime meaning from deployment to decommissioning, not just the current release. The logs have to support identifying risk situations, post-market monitoring, and oversight of operation. Articles 19 and 26 set a six-month minimum retention floor, and for many systems GDPR or sector rules push that longer. But even where a system is not high-risk, logging is the foundation of operational trust: in production, something always eventually goes wrong, and logs are how you diagnose, explain, and fix it. The architecture has to carry the weight — structured, time-stamped, indexed, queryable records in a tamper-evident store, capturing inputs, outputs, model version and parameters, human decisions, and system events. Scattered print statements do not satisfy Article 12, and they do not survive an audit. Build this into the first sprint; retrofitting it later means touching every component in the pipeline.
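To make "structured, time-stamped, tamper-evident" concrete, here is a minimal sketch in Python. It assumes a hash-chained, append-only log, which is one common way to get tamper evidence and not something Article 12 prescribes. The class name AuditLog, the field names, and the in-memory list are illustrative assumptions; a real deployment would write to durable, access-controlled storage.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the very first record


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    """Append-only, hash-chained log (in-memory stand-in for a real store)."""

    records: list = field(default_factory=list)

    def append(self, event_type: str, model_version: str, payload: dict) -> dict:
        """Add one structured, time-stamped record linked to its predecessor."""
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS_HASH
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event_type": event_type,        # e.g. "inference", "human_decision"
            "model_version": model_version,
            "payload": payload,              # must be JSON-serialisable
            "prev_hash": prev_hash,
        }
        record = {**body, "hash": _digest(body)}
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed record breaks it."""
        prev_hash = GENESIS_HASH
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
                return False
            prev_hash = record["hash"]
        return True
```

Because each record carries the hash of the one before it, changing or deleting an earlier entry invalidates every later hash, which is what makes after-the-fact edits detectable. Retention, indexing, and access control still have to be handled by the surrounding storage layer.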
Data minimisation by architecture. The GDPR demands minimisation; the AI Act demands data governance for high-risk systems under Article 10. Both push in the same direction — process only the data the task needs, for the purpose it was designed for, and nothing more. In practice that means filtering personal data out at the ingestion layer before it ever enters the pipeline. If you are classifying invoices, you do not need the customer's date of birth, so it should be stripped at the door, not carried through and "handled responsibly" downstream. It means purpose-scoped data flows rather than a shared lake where everything reaches everything and purpose limitation becomes unenforceable. It means retention periods that are enforced automatically, with training data, inference logs, and intermediate results purged on schedule. And it means pseudonymising or aggregating wherever the task allows. Deciding to minimise at the point of collection — rather than restrict access after the fact — eliminates whole categories of risk and shrinks your GDPR exposure at the same time.
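Stripping data "at the door" can be sketched as an allow-list applied at ingestion, so that any field not explicitly needed for the purpose never enters the pipeline. The example below is a hedged illustration in Python: the field names, the choice of customer_id as the pseudonymised field, and the keyed-hash approach are all assumptions for the invoice-classification case, not a prescribed method.

```python
import hashlib
import hmac

# Hypothetical allow-list for the invoice-classification purpose.
ALLOWED_FIELDS = frozenset({"invoice_id", "customer_id", "amount", "currency", "line_items"})
# Fields that are kept, but only in pseudonymised form.
PSEUDONYMISED_FIELDS = frozenset({"customer_id"})


def pseudonymise(value: str, key: bytes) -> str:
    """Keyed hash (HMAC-SHA256): stable per key, not reversible without it."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(raw: dict, key: bytes) -> dict:
    """Keep only allow-listed fields; everything else is dropped at ingestion."""
    clean = {}
    for name, value in raw.items():
        if name not in ALLOWED_FIELDS:
            continue  # e.g. date_of_birth never enters the pipeline
        if name in PSEUDONYMISED_FIELDS:
            value = pseudonymise(str(value), key)
        clean[name] = value
    return clean
```

An allow-list fails closed: a new upstream field is dropped until someone decides the purpose needs it, whereas a block-list would let it through by default. The key used for pseudonymisation has to be stored and rotated separately from the data for the pseudonymisation to mean anything.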
Human oversight by workflow design. Article 14 requires that high-risk systems be designed so that natural persons can effectively oversee them while they are in use — explicitly including the human-machine interface tools that make oversight possible. The Article is precise about what "effective" means: the overseer must understand the system's capacities and limitations, stay alert to automation bias (the documented tendency to over-rely on a confident-looking output), correctly interpret what the system produces, and retain the authority to disregard, override, or stop it. None of that is a checkbox bolted on at the end — it is workflow design. A named person needs the training, the tooling, and the standing to intervene; outputs have to be surfaced in a form a human can actually evaluate rather than a raw confidence score; and there has to be a defined escalation path for the edge cases. Build an autonomous end-to-end workflow first and try to insert a human checkpoint later, and you are not adding a feature — you are redesigning the workflow. The pattern we use is to give every AI workflow explicit decision points where the output, its confidence, and any flags are shown to a human who approves, modifies, or rejects, and whose decision is logged before the workflow proceeds.
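The decision-point pattern described above can be sketched as a small gate function. This is a simplified illustration in Python, assuming the review interface and the audit log are passed in as callables; the names Proposal, Decision, Verdict, ask_human, and log_decision are hypothetical and stand in for whatever tooling the workflow actually uses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Tuple


class Verdict(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    REJECT = "reject"


@dataclass(frozen=True)
class Proposal:
    """What the reviewer is shown: the output, its confidence, and any flags."""
    output: str
    confidence: float
    flags: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Decision:
    reviewer: str
    verdict: Verdict
    final_output: Optional[str]


def decision_point(
    proposal: Proposal,
    reviewer: str,
    ask_human: Callable[[Proposal], Tuple[Verdict, Optional[str]]],
    log_decision: Callable[[Proposal, Decision], None],
) -> Optional[str]:
    """Block on a human verdict, log it, and only then release the result."""
    verdict, edited = ask_human(proposal)
    if verdict is Verdict.APPROVE:
        final = proposal.output
    elif verdict is Verdict.MODIFY:
        if edited is None:
            raise ValueError("a MODIFY verdict needs the edited output")
        final = edited
    else:
        final = None  # rejected: nothing is passed downstream
    log_decision(proposal, Decision(reviewer, verdict, final))  # logged before proceeding
    return final
```

The workflow cannot continue without a verdict, and the verdict is recorded before the result is returned, so the human checkpoint and its audit trail are part of the control flow rather than an optional side channel.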
Documentation generated, not written. Article 11 and Annex IV require technical documentation for high-risk systems, drawn up before the system goes live and kept current — covering the system's purpose, design, monitoring, performance, accuracy and its limits, the risk-management system under Article 9, and the post-market monitoring plan. (Annex IV explicitly lets SMEs, including start-ups, provide these elements in a simplified form — relief that matters for the Mittelstand.) Most organisations treat this as a Word document written after the build, by someone who was not in the build, and stale before it is signed. Compliance by design treats documentation as an output of the system itself: model cards generated from the training pipeline, data sheets describing each dataset's provenance and preprocessing, architecture docs living in the repository next to the code, test results versioned with the model, and change logs that record who changed what and why. When an auditor asks for documentation, you generate it — you do not start writing it. That is the whole difference between documentation that is always current and documentation that is always behind.
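As a rough illustration of documentation as a system output, the sketch below renders a plain-text model card from metadata that a training pipeline could emit. It is written in Python under stated assumptions: the ModelRecord fields are hypothetical, they cover only a small part of what Annex IV asks for, and the card layout is illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModelRecord:
    """Metadata a training pipeline could emit alongside the model artefact."""
    name: str
    version: str
    intended_purpose: str
    training_datasets: Tuple[str, ...]
    metrics: Dict[str, float] = field(default_factory=dict)
    known_limitations: Tuple[str, ...] = ()


def render_model_card(record: ModelRecord) -> str:
    """Render the card from recorded facts; nothing is typed in by hand."""
    lines = [
        f"Model card: {record.name} {record.version}",
        f"Intended purpose: {record.intended_purpose}",
        "Training datasets:",
        *[f"  - {name}" for name in record.training_datasets],
        "Evaluation metrics:",
        *[f"  - {metric}: {value:.3f}" for metric, value in sorted(record.metrics.items())],
        "Known limitations:",
        *[f"  - {item}" for item in record.known_limitations],
    ]
    return "\n".join(lines)
```

If the record is written by the training job and the card is rendered on demand, a new model version produces a new card with no manual step, so the documentation cannot lag behind the model it describes.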
EU-resident infrastructure by default. Data residency is a live concern under both the GDPR and the AI Act, and for DACH enterprises the cleanest answer is the simplest one: keep everything in the EU. Compute in EU regions or sovereign-cloud providers; inference on EU-resident endpoints rather than quietly routed through US infrastructure; storage inside the EU with no cross-border transfer for processing; and provider agreements that contractually guarantee it. We treat this as non-negotiable on every deployment, because it removes an entire layer of complexity — transfer impact assessments, Standard Contractual Clauses for AI data flows, the post-Schrems uncertainty around US providers — before it can attach to the project.
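One way to turn "EU by default" into something enforceable is a residency check that runs at deployment time and fails if any configured service sits outside an approved region list. The Python sketch below assumes a simple in-code service inventory; the Service type, the function names, and the region identifiers in the allow-list are illustrative and depend on the provider. A check like this supplements the contractual guarantees and does not replace them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

# Illustrative allow-list; actual region identifiers depend on your provider.
ALLOWED_REGIONS: FrozenSet[str] = frozenset({"eu-central-1", "eu-west-1", "europe-west3"})


@dataclass(frozen=True)
class Service:
    name: str
    kind: str    # "compute", "inference", or "storage"
    region: str


def residency_violations(
    services: Iterable[Service],
    allowed: FrozenSet[str] = ALLOWED_REGIONS,
) -> List[str]:
    """Describe every service configured outside the allowed regions."""
    return [
        f"{s.name} ({s.kind}) runs in {s.region}"
        for s in services
        if s.region not in allowed
    ]


def assert_eu_resident(
    services: Iterable[Service],
    allowed: FrozenSet[str] = ALLOWED_REGIONS,
) -> None:
    """Fail fast, for example in CI or at start-up, if anything is non-EU."""
    violations = residency_violations(services, allowed)
    if violations:
        raise RuntimeError("Non-EU services configured: " + "; ".join(violations))
```

Running the check in the deployment pipeline means a non-EU endpoint is caught as a failed build rather than discovered later in a transfer impact assessment.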
Continuous monitoring and drift detection. A system that is compliant on launch day is not compliant forever. Models degrade, input distributions shift, and an output that was accurate at deployment can quietly become unreliable six months on, which is precisely why the AI Act ties record-keeping to post-market monitoring rather than to a one-off conformity snapshot. So the monitoring has to be designed in: automated tracking of accuracy and the other task-specific metrics against defined thresholds; statistical drift detection on inputs and outputs; ongoing fairness measurement across the groups that matter for the use case; alerts that route a threshold breach to the human overseer; and a defined incident response so that when monitoring catches something, it is already clear who is notified and how fast they must act. Build the dashboard before you deploy. Left as an afterthought, monitoring often never gets built, and the problem first surfaces when a customer complains or an auditor asks.
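Statistical drift detection with alert routing can be sketched in a few lines. The Python example below uses the Population Stability Index (PSI) on a single numeric feature, which is one common drift measure among several and an assumption here, not something the AI Act specifies. The 0.2 default threshold is a frequently quoted rule of thumb rather than a regulatory value, and check_drift and notify are hypothetical names.

```python
import math
from typing import Callable, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline has no spread; PSI is undefined")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # out-of-range values go to edge bins
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def check_drift(baseline: Sequence[float], current: Sequence[float],
                notify: Callable[[str], None], threshold: float = 0.2) -> bool:
    """Route a threshold breach to the overseer; return True if drift was flagged."""
    score = psi(baseline, current)
    if score > threshold:
        notify(f"Input drift detected: PSI={score:.3f} exceeds {threshold}")
        return True
    return False
```

In practice the baseline would be a sample captured at validation time, the check would run on a schedule for each monitored feature and output, and notify would feed the incident process rather than a print statement.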
The cost of not designing for it
We have seen the alternative play out. A team ships an AI workflow that works beautifully — fast, accurate, cleanly integrated. Months later, compliance asks for the audit log behind a specific decision, an explanation of how the model reached an output, and evidence that only the minimum necessary data is processed. The engineers open the codebase and find no structured logs, a model behind a third-party API with no metadata tracking, a pipeline that ingests everything unfiltered, and no human review step.
Fixing that is not incremental, and this is the point practitioners underrate. The pipeline has to be restructured. The model has to be wrapped in logging and explainability layers. The workflow has to be redesigned around a human checkpoint. Filtering has to be inserted at ingestion. In practice the system is rebuilt — and the rebuild runs while the non-compliant version is still in production, carrying live exposure the whole time. The schedule slips, scarce engineering capacity is consumed re-doing solved work, and the bill lands far above what designing it correctly would ever have cost. That is the cost-of-late-fix curve, applied to compliance instead of bugs.
A note on timing, not urgency theatre
It is worth being precise, because the dates are moving. Under the Digital Omnibus political agreement reached in late 2025, the obligations for stand-alone Annex III high-risk systems are set to shift from 2 August 2026 to 2 December 2027 — but only once the amendments are formally adopted and published in the Official Journal, which had not yet happened as of mid-2026. The Article 50 transparency duties stay on the original August 2026 schedule regardless. The honest takeaway is not "panic by August." It is that a system designed for compliance today is ready whichever way the dates land, while a system that needs retrofitting is exposed to the version of the deadline that actually takes effect. Architecture is the only hedge that works under regulatory uncertainty. (We unpack the Omnibus changes in more depth in our analysis of the Digital Omnibus for enterprises.)
Start with architecture, not paperwork
Compliance by design is not about producing more documents. It is about making better architectural decisions, after which the documents fall out of the architecture almost for free: audit logs become audit trails, data minimisation becomes your privacy evidence, human oversight becomes your governance record. This is how the AI Operating System is built: risk classification mapped before design, logging in the first commit, EU-resident infrastructure and monitoring stood up before go-live, and a DPIA completed before anything reaches production. The result is a system that is compliant from day one, not because compliance was added, but because it was never absent.
A Fit Call maps the EU AI Act and GDPR requirements onto your specific use case and outlines an architecture that satisfies them by design — before you build something you would otherwise have to rebuild.
References: EU AI Act, Article 11 (Technical Documentation) and Annex IV; Article 12 (Record-Keeping); Article 14 (Human Oversight), artificialintelligenceact.eu/article/12, /article/14, /article/11, /annex/4; GDPR Article 25 (Data Protection by Design and by Default), gdpr-info.eu/art-25-gdpr; Gibson Dunn, "EU AI Act Omnibus Agreement — Postponed High-Risk Deadlines and Other Key Changes," 2026, gibsondunn.com/eu-ai-act-omnibus-agreement-postponed-high-risk-deadlines-and-other-key-changes.
