Our blog

SovereignAI: powerful LLMs that never leave the EU

Frontier models from the big providers are effectively off-limits for regulated companies, public administration and medicine, the data leaves the EU and the obligations pile up. SovereignAI is our answer: powerful language models and inference capacity operated inside the EU that you deploy without compromising on security. What it runs on, which models, and what it costs versus your own on-prem.

Jozef Krivaček · 6/14/2026 · 7 min read

The problem frontier AI won't solve

Large language models from the big US providers are excellent. The problem is where they run and who can reach your data. The moment you send them the content of a contract, medical records or an internal file, your data leaves the EU and you hand it to a provider who is also subject to US law.

For an ordinary company that is an inconvenience. For a regulated company, public administration and healthcare it is the reason AI is not deployed into sensitive processes at all. Not because they don't want to, but because they are not allowed to.

SovereignAI is our answer: powerful language models and inference capacity operated inside the EU that you deploy into your own processes without the data ever leaving the jurisdiction. In this article we look at what it runs on, which models can run there, who it makes sense for, and what it costs compared with building the whole infrastructure yourself.

Who it's for and why

The common denominator is data sensitivity and regulation. Three typical groups:

Regulated B2B companies (finance, law, energy, manufacturing with know-how). They deal with GDPR, protection of trade secrets and often NIS2 too. A frontier API means sending data to a third party outside the EU, which simply isn't viable for sensitive data.
Public administration and municipalities. Digital sovereignty is not a slogan but a requirement: citizens' data should stay under the institution's control and inside the EU. On top of that come obligations under the EU AI Act for deploying AI in the public sector.
Healthcare and medicine. Health data is, under GDPR, a special category (Art. 9) with the strictest regime. Add to that the European Health Data Space. Sending such data to a US cloud is, in practice, untenable.

The reason "just pick a European region" with a big provider is not enough is simple: if the cloud operator is a US company, it is subject to US law (CLOUD Act, FISA) regardless of where the servers physically stand. Sovereignty is not about where the datacenter is, but about who has legal and actual control over the data.

What it runs on

SovereignAI is not a wrapper over someone else's API. It is real inference capacity that we operate in the EU:

NVIDIA GPU accelerators (H100 / H200) in an EU datacenter, on which the models actually run.
An inference layer optimised for throughput and latency (vLLM, TensorRT-LLM and similar), so you get the most out of the hardware.
Isolation matched to sensitivity: from a logically separated environment (single-tenant) to dedicated capacity just for you, or deployment directly on-premise on your own hardware.
The data stays in the jurisdiction. No training on your inputs, no handover to third parties, auditable access.

SovereignAI inference capacity runs on NVIDIA GPU accelerators in an EU datacenter.

Which models can run there

The strength of the sovereign approach is that it runs on open-weight models, whose weights you download and run in your own environment. Today's open-weight scene is very close in performance to the proprietary models:

Mistral / Mixtral (EU origin), Llama (Meta), Qwen (Alibaba), DeepSeek, or open models from OpenAI (gpt-oss).
We can fine-tune the models on your data or build RAG over them with access to your documents.

There is an important nuance here, one you often ask about. Because the model's weights run locally and your data never leaves the environment, even models of Asian origin (Qwen, DeepSeek) are safe to operate from a data-leak perspective, because they run in isolation and send nothing "home".

The model's origin does, however, stay relevant for a different reason than the data: into the isolated environment the model also brings its behaviour, that is, built-in restrictions, possible bias or security risks trained directly into the weights. This does not leave the environment even when it is fully disconnected. So the choice is driven by who you are and how sensitive the case is:

an ordinary B2B deployment can happily pick the most capable open-weight model regardless of origin,
for the most sensitive environments (medicine, parts of public administration) we curate the choice to models of European origin (Mistral, today fully open-weight and at the performance top), where trust in the model's behaviour and in the supply chain is highest.

We don't pick a model by marketing, but by the task, the language and your sovereignty requirements.

What it costs versus building it yourself

The most common question is: "Why don't we build the whole thing on-prem ourselves?" For some large organisations that makes sense. For most it doesn't, and here is why. Building and operating your own LLM infrastructure means carrying this whole chain of costs:

GPU capex. H100 / H200 class accelerators are expensive, and for meaningful performance you need several, not one. Before buying, factor in lead times too.
Power and cooling. GPU nodes draw a lot of power and need dimensioned supply and cooling, which is a separate investment in the facility.
An operations team. Someone has to build, update, monitor and tune the infrastructure (MLOps / SRE). That is not a half-time job, it is a team.
Depreciation and obsolescence. Hardware ages fast, GPU generations change on the order of a couple of years. What you buy today you are chasing in a few years.
The utilization problem. The most expensive GPU is the one sitting idle. You pay for your own cluster all the time, even when you use it only a few hours a day.

SovereignAI spreads these costs and shifts them onto us. You pay for the capacity you actually use, or for reserved capacity with a predictable fee, instead of tying up capital in hardware that goes obsolete. Capex turns into predictable opex, and operation, updates and utilization are our problem.

Full on-prem makes sense where you need maximum isolation, have high and steady utilization and the capacity for your own operations team. Even then we can help, we'll build it and service it at your site. For everyone else, shared or dedicated sovereign capacity is cheaper and faster.

Full on-premise makes sense where you need maximum isolation and have steady utilization.

Two ways to start

So that you don't have to choose between "nothing" and "your own datacenter", we offer SovereignAI in two modes that can be combined:

Pay-as-you-go API. A sovereign alternative to the big providers: the same convenience as an ordinary AI API, but hosted in the EU and compliant with GDPR and the EU AI Act. Ideal for a fast start and proving value.
Reserved dedicated capacity. Your own GPUs and models, just for you, with strict isolation and a predictable monthly fee. For scale and the most sensitive deployments, including on-prem.

You can start small via the API and "grow into" dedicated capacity the moment volume and data sensitivity call for it. The investment in the first step is not lost.

Where to talk it through

SovereignAI is part of our products and naturally builds on the AI solutions we deploy into companies. If you are weighing AI in an environment where data must not leave the EU, write to us. We'll go through your specific case, the right model, and whether the API or dedicated capacity suits you better.

Jozef Krivaček

CEO, Omnius