GPU infrastructure is the largest capital decision in enterprise AI. Get it right and you build a cost advantage that compounds. Get it wrong and you are locked into expensive hardware that depreciates while better alternatives emerge.

The economics have shifted significantly in 2025 and 2026. New GPU generations, falling cloud prices, and improved availability have changed the calculus — but the DACH-specific factors of energy costs, regulatory requirements, and depreciation rules add complexity that generic US-centric analyses miss.

The hardware landscape in 2026

NVIDIA H100. The workhorse of 2024–2025 inference. Purchase price has stabilised at $30,000 to $40,000 per unit, according to IntuitionLabs' 2026 pricing guide. An 8-GPU HGX H100 server runs over $250,000 including chassis, networking, and storage. Cloud rental costs $2.50 to $3.50 per hour on mid-tier providers (Spheron, Lambda), or $6.00 to $12.00 per hour on hyperscalers (AWS p5, Azure ND H100).

NVIDIA H200. The 2025–2026 upgrade. 141 GB HBM3e memory (versus 80 GB on H100) enables larger models without multi-GPU setups. An 8-GPU system costs approximately $315,000. Cloud availability is growing but not yet universal.

NVIDIA L40S. The cost-efficient alternative for inference workloads that do not require the full H100 capability. At $8,000 to $12,000 per unit, it runs 7B to 13B models efficiently and fits in standard data centre racks without liquid cooling. For Mittelstand companies running small-to-mid-range models, this is often the right hardware choice.

Used A100 80GB. Now available at $8,000 to $12,000 per unit on the secondary market — 40 to 50 percent below original pricing. For companies deploying proven workloads that do not need the latest generation, used A100s offer excellent price-performance.

The three-year TCO comparison

For a representative Mittelstand workload — one production model at 20 million tokens per day — here is the three-year total cost of ownership:

Cloud GPU (reserved instances). $3,000 to $4,000 per month for an H100 on a 1-year commitment with a mid-tier provider. Three-year cost: $108,000 to $144,000, assuming the commitment is renewed at the same rate for all 36 months. Includes hardware, networking, cooling, and basic management. Excludes ML engineering time for deployment and monitoring.

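On-premise purchase. Initial hardware: $40,000 (single H100) plus $15,000 for server chassis, networking, and installation, or $55,000 up front. Annual operational costs: $6,000 to $10,000 for electricity at German industrial rates of €0.20 to €0.25 per kWh, $3,000 to $5,000 for maintenance and cooling, plus rack space rental if not in-house. The electricity estimate is well above what the GPU alone would cost: a 700W card running continuously uses about 6,130 kWh per year, or roughly €1,200 to €1,500 at these rates, so the figure is best read as a budget for the whole server and facility overhead. Three-year cost: $82,000 to $100,000 in infrastructure alone, before rack space rental and ML engineering labour.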

Hybrid. Cloud GPU for development and spiky workloads. Reserved or on-premise GPU for steady-state production. Three-year cost: $75,000 to $110,000 depending on the split. This is the architecture most cost analyses recommend.
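The cloud and on-premise totals above are simple sums, so they are easy to re-run with your own quotes. The following is a minimal sketch in Python that uses only the figures quoted in this section. The function and variable names are illustrative, rack space rental and ML engineering labour are excluded as in the text, and the hybrid case is left out because its cost depends on a split that is not specified here.

```python
YEARS = 3
MONTHS = YEARS * 12


def cloud_tco(monthly_rate: float, months: int = MONTHS) -> float:
    """Total cost of a reserved cloud GPU, assuming a flat monthly rate."""
    return monthly_rate * months


def on_prem_tco(
    hardware: float,
    setup: float,
    annual_electricity: float,
    annual_maintenance: float,
    years: int = YEARS,
) -> float:
    """Up-front purchase plus recurring operating costs over the period."""
    return hardware + setup + years * (annual_electricity + annual_maintenance)


# Low and high ends of the ranges quoted in this section (USD).
cloud_low = cloud_tco(3_000)
cloud_high = cloud_tco(4_000)
on_prem_low = on_prem_tco(40_000, 15_000, 6_000, 3_000)
on_prem_high = on_prem_tco(40_000, 15_000, 10_000, 5_000)

print(f"Cloud (reserved): ${cloud_low:,.0f} to ${cloud_high:,.0f}")
print(f"On-premise:       ${on_prem_low:,.0f} to ${on_prem_high:,.0f}")
```

Swapping in a different monthly rate or hardware quote shows quickly how sensitive the gap between the two options is to each input.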

DACH-specific factors

Energy costs. Germany's commercial electricity prices are among Europe's highest, according to CNBC's 2026 analysis of European AI energy economics. A single H100 drawing 700W continuously consumes approximately 6,130 kWh annually. At €0.22 per kWh, that is €1,350 per year per GPU — modest for a single card, but significant when scaled. An 8-GPU setup costs €10,800 annually in electricity alone. In comparison, US data centre operators pay roughly half.
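The energy figures follow directly from power draw, hours per year, and tariff. Here is a small Python sketch of that arithmetic, assuming a constant 700W draw per GPU and counting the GPUs only; host CPUs, fans, and cooling overhead are not included, and the function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_kwh(watts: float) -> float:
    """Energy used in one year at a constant power draw."""
    return watts / 1000 * HOURS_PER_YEAR


def annual_energy_cost(watts: float, eur_per_kwh: float, gpus: int = 1) -> float:
    """Yearly electricity cost in euros for one or more GPUs."""
    return annual_kwh(watts) * eur_per_kwh * gpus


kwh_per_gpu = annual_kwh(700)
cost_one_gpu = annual_energy_cost(700, 0.22)
cost_eight_gpus = annual_energy_cost(700, 0.22, gpus=8)

print(f"Per GPU: {kwh_per_gpu:,.0f} kWh, about EUR {cost_one_gpu:,.0f} per year")
print(f"8 GPUs:  about EUR {cost_eight_gpus:,.0f} per year")
```

The unrounded results sit slightly below the rounded €1,350 and €10,800 used in the text. Changing the tariff argument shows how the same hardware would cost under a different electricity contract.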

The Energy Efficiency Act. Under Germany's EnEfG, data centres have had to cover 50 percent of their electricity consumption from renewable sources since January 2024, rising to 100 percent from January 2027. For on-premise GPU installations, this means either signing renewable energy contracts (which carry a 10 to 20 percent premium in Germany) or purchasing renewable energy certificates. This adds both cost and procurement complexity.

Depreciation under HGB. Under German commercial accounting rules (HGB), computer hardware is typically depreciated on a straight-line basis over 3 years, and GPU hardware purchased for AI infrastructure follows the same schedule. For an on-premise purchase, the full cost is therefore expensed over 36 months rather than in the year of purchase. Cloud GPU costs are expensed immediately as operating expenditure. The choice between CapEx and OpEx depends on your company's financial situation and tax position.
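To see what a three-year straight-line schedule looks like in numbers, here is a minimal Python sketch. It assumes the full $55,000 on-premise outlay from the TCO comparison is capitalised and that the asset is in use for three full years; both are simplifications for illustration, and the function name is hypothetical.

```python
def straight_line_schedule(cost: float, useful_life_years: int = 3) -> list[float]:
    """Equal annual depreciation amounts; the last year absorbs rounding."""
    annual = round(cost / useful_life_years, 2)
    schedule = [annual] * (useful_life_years - 1)
    schedule.append(round(cost - annual * (useful_life_years - 1), 2))
    return schedule


# Assumption: hardware ($40,000) and setup ($15,000) are capitalised together.
schedule = straight_line_schedule(55_000)

for year, amount in enumerate(schedule, start=1):
    print(f"Year {year}: ${amount:,.2f}")
```

By comparison, the reserved cloud option in the same comparison would show up as $36,000 to $48,000 of operating expense in each of the three years.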

Data centre capacity. Frankfurt is Europe's largest data centre market and among its most constrained. Rack space costs €150 to €300 per kW per month in Frankfurt colocation facilities, with availability tightening. Companies outside Frankfurt face longer lead times for colocation. Munich, Berlin, and Hamburg offer alternatives but with less connectivity infrastructure.
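Because colocation is priced per kW, the rack cost for a GPU server can be estimated from its power draw. The Python sketch below applies the Frankfurt price range above to an 8-GPU system, assuming the billed power equals the GPUs' combined 700W draw (5.6 kW). That is a lower bound, since a real server also powers CPUs, memory, and fans, and the names used are illustrative.

```python
def colocation_cost_per_month(power_kw: float, eur_per_kw_month: float) -> float:
    """Monthly colocation fee for a given contracted power."""
    return power_kw * eur_per_kw_month


# Assumption: 8 GPUs at 700 W each, GPUs only.
gpu_power_kw = 8 * 700 / 1000

monthly_low = colocation_cost_per_month(gpu_power_kw, 150)
monthly_high = colocation_cost_per_month(gpu_power_kw, 300)

print(f"Monthly: EUR {monthly_low:,.0f} to EUR {monthly_high:,.0f}")
print(f"Yearly:  EUR {monthly_low * 12:,.0f} to EUR {monthly_high * 12:,.0f}")
```

Whether electricity is included in the per-kW price varies by contract, so this figure should be checked against the actual colocation offer before it is added to a TCO model.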

The decision framework

Buy on-premise when: volume exceeds 200 million tokens per day on a sustained basis, your team includes at least 2 ML infrastructure engineers, you have existing data centre space with adequate power and cooling, and your three-year planning horizon is stable enough to justify the capital commitment.

Use cloud GPU when: volume is under 50 million tokens per day, your workloads are spiky or unpredictable, you need the flexibility to change GPU generations as new hardware releases, or your team lacks infrastructure engineering capability.

Use hybrid when: you have stable production workloads that justify reserved capacity plus development and experimentation workloads that benefit from cloud flexibility. This describes most Mittelstand companies with more than 5 AI workloads.
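The three rules above can be written down as a small decision function. The sketch below is one possible Python encoding, not a definitive rule set: the field names are hypothetical, "lacks infrastructure engineering capability" is interpreted as having no ML infrastructure engineers, the on-premise rule is checked first, and hybrid is treated as the fallback when neither of the other rules applies.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    tokens_per_day: int
    spiky: bool
    ml_infra_engineers: int
    has_datacentre_space: bool
    stable_three_year_horizon: bool
    needs_generation_flexibility: bool = False


def recommend(w: Workload) -> str:
    """Map the framework's criteria to 'on-premise', 'cloud', or 'hybrid'."""
    # On-premise requires all four conditions to hold at once.
    if (
        w.tokens_per_day > 200_000_000
        and w.ml_infra_engineers >= 2
        and w.has_datacentre_space
        and w.stable_three_year_horizon
    ):
        return "on-premise"
    # Cloud applies if any single condition holds.
    if (
        w.tokens_per_day < 50_000_000
        or w.spiky
        or w.needs_generation_flexibility
        or w.ml_infra_engineers == 0
    ):
        return "cloud"
    # Assumption: everything in between is treated as hybrid.
    return "hybrid"


# The representative workload from the TCO comparison: 20 million tokens per day.
example = Workload(
    tokens_per_day=20_000_000,
    spiky=False,
    ml_infra_engineers=1,
    has_datacentre_space=False,
    stable_three_year_horizon=True,
)
print(recommend(example))
```

Writing the rules this way makes the gap between 50 and 200 million tokens per day explicit: workloads in that band land on hybrid unless one of the cloud conditions applies.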

Book a fit call to model your GPU infrastructure economics. We calculate the TCO for your specific workloads, volumes, and DACH constraints, including energy, depreciation, and regulatory factors that generic calculators miss.


References: IntuitionLabs, "NVIDIA AI GPU Prices: H100 & H200 Cost Guide," 2026; Spheron, "GPU Cloud Pricing 2026"; GetDeploying, "H100 Cloud Pricing: Compare 43+ Providers," 2026; GMI Cloud, "NVIDIA H100 GPU Pricing 2026: Rent vs. Buy Cost Analysis"; CNBC, "High Energy Prices Could Derail Europe's AI Race," May 2026; German Energy Efficiency Act (EnEfG), 2023; TechPolicy.Press, "Germany's Data Centre Boom Is Pushing the Power Grid to Its Limits," 2026.