stackschmiede.de
Sovereign AI · On-prem · RAG

AI is a tool. Not an OpenAI subscription.

Most "AI integrations" are thin wrappers around ChatGPT. It works — but it ships every prompt, every document, every customer conversation to a US provider. For legal, medical, public-sector or R&D contexts, that's not an option.

Why not just OpenAI or Anthropic?

Because you hand yourself over to a provider who can unilaterally change prices, terms of use, API behavior and regions, without your input. In the last two years, every major LLM vendor has pushed through price increases, model deprecations and rate-limit changes that customers could only absorb, not negotiate.

On-prem LLMs on your server are insurance: predictable cost (hosting instead of token roulette), data stays in-house, features can't be cancelled overnight. That's not ideology — that's business continuity management.

02 / Sovereign AI

Your own ChatGPT. On your server.

Most "AI features" are API wrappers: your data flows to OpenAI, your costs flow to AWS, your GDPR compliance becomes someone else’s problem. There is another way. Local-first, fully sovereign, with control over model and data.

US-Cloud           | EU / On-Prem
OpenAI / Anthropic | Llama 3.3 or Mistral on your server
Pinecone           | Qdrant, self-hosted
ChatGPT plugin     | RAG over your docs
AWS Bedrock        | vLLM on Hetzner GPU
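The "RAG over your docs" entry above can be sketched in a few lines. This is a toy illustration, not the production pipeline: it swaps the real embedding model and Qdrant for a bag-of-words retriever, and the example documents are invented.

```python
from collections import Counter
import math

# Toy document store; in production these would be chunks stored in Qdrant.
DOCS = [
    "Invoices are archived for ten years on the internal file server.",
    "GPU servers are hosted at Hetzner in Falkenstein, Germany.",
    "Support tickets are answered within one business day.",
]

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    # Rank documents by similarity to the question, return the top k.
    q = embed(question)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    # The retrieved context plus the question is what the local LLM sees.
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("Where are the GPU servers hosted?")
```

In the real stack, `embed` is an embedding model, `DOCS` lives in Qdrant, and `prompt` goes to the Llama/Mistral model served by vLLM; the pattern itself stays this simple.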
demo.stackschmiede.de/ausmalbild
Live

Stable Diffusion XL · Line-Art LoRA · Hetzner GPU · text-prompt based

Common questions

Does "sovereign AI" really mean no data goes to OpenAI / Anthropic?

Yes — the default setup runs the entire LLM on your server (or my Hetzner GPU in Falkenstein). There is no fallback to external APIs unless you explicitly configure that for low-sensitivity use cases.

Does Llama 3.3 reach GPT-4 quality?

For structured domain tasks (document extraction, summarization, RAG answers) — yes, sometimes better with fine-tuning. For long-form creative writing: slightly behind. We evaluate in project context.

Do I need my own hardware?

No. Hetzner GPU dedicated servers from roughly €200/month are the standard path; your own hardware only makes sense for very high load or specific compliance requirements.

What are operating costs after launch?

GPU hosting €150-500/month depending on model size and load, plus monitoring and updates. Typically 20-40% cheaper than equivalent OpenAI bills — and predictable.
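The break-even behind that comparison is simple arithmetic. The figures below are illustrative assumptions (token volume, a blended API price of $10 per million tokens, €300/month flat hosting, currencies treated as roughly equal), not a quote:

```python
# Illustrative assumptions -- adjust to your own volume and prices.
tokens_per_month = 50_000_000   # assumed monthly workload
api_price_per_1m = 10.0         # assumed blended price per 1M tokens
flat_gpu_hosting = 300.0        # assumed flat monthly hosting cost

# Per-token billing scales with usage; flat hosting does not.
api_bill = tokens_per_month / 1_000_000 * api_price_per_1m
savings = 1 - flat_gpu_hosting / api_bill
```

With these assumed numbers the API bill lands at 500 per month against 300 flat, a 40% difference; the real crossover depends entirely on your token volume, which is why it gets evaluated per project.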

How does it integrate with my existing stack?

Via REST, GraphQL or WebSocket. Standard patterns: chat widget, document upload, batch processing, webhooks. Also as an MCP server (Model Context Protocol).
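For the REST path: vLLM exposes an OpenAI-compatible endpoint, so existing client code usually needs only a new base URL. A minimal sketch; the host `gpu.example.internal` and the model name are placeholders for your own deployment:

```python
import json
import urllib.request

# Placeholder host -- point this at your own vLLM server (default port 8000).
BASE_URL = "http://gpu.example.internal:8000/v1/chat/completions"

def chat_payload(user_message: str,
                 model: str = "meta-llama/Llama-3.3-70B-Instruct") -> dict:
    # Standard OpenAI-style chat completion body; vLLM accepts the same schema.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.2,
    }

def build_request(payload: dict) -> urllib.request.Request:
    # Builds (but does not send) the HTTP request object.
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(chat_payload("Summarize invoice 4711."))
# urllib.request.urlopen(req) would send it to the on-prem server.
```

Because the schema matches OpenAI's, most SDKs and chat widgets can be repointed at the on-prem server without code changes beyond the base URL.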

What about the EU AI Act?

On-prem LLMs are easier to document with respect to transparency obligations. For high-risk applications I refer you to specialized AI lawyers; legal assessments aren't my trade.

06 / Contact

Let’s talk.

Three channels, one contact. Reply within 24 hours on business days.

  • Phone (on request via email)
    Number shared after a short email pre-clarification.
  • Form
Include a short note on your project context.
Response: < 24h on business days
Data transfer: encrypted (TLS 1.3)
Spam protection: Cloudflare Turnstile (no reCAPTCHA)