EU-Sovereign AI: How We Built InnConnect Without American Clouds
The sovereignty question
When we started building InnConnect — our multi-tenant AI chatbot platform — the technical question was straightforward: how do we build a knowledge-base-powered chatbot that gives accurate, contextual answers? The harder question was: where does the data live, and who can access it?
Under the US CLOUD Act, American cloud providers can be compelled to hand over data stored anywhere in the world. For European businesses handling customer conversations, personal data, and knowledge base content, this isn't a theoretical risk. It's a compliance gap that gets wider every year as EU data sovereignty regulations tighten.
We decided to build InnConnect as a fully EU-sovereign platform from day one. Not as a feature to be bolted on later — as an architectural principle.
Architecture: RAG without American clouds
InnConnect uses Retrieval-Augmented Generation (RAG) to answer customer questions using a business's own knowledge base. Here's how the architecture works:
- Knowledge Base. Businesses upload articles, FAQs, product information, and support documentation. Content is chunked, embedded, and stored in PostgreSQL with pgvector for vector similarity search. All EU-hosted.
- Retrieval. When a customer asks a question, we perform semantic search across the knowledge base to find the most relevant context. This happens entirely within our EU infrastructure.
- Generation. The retrieved context is sent to the LLM along with the customer's question and a carefully structured system prompt. We support multiple AI providers and route requests based on data sensitivity and performance requirements.
- Multi-tenant isolation. Each tenant has its own database, its own knowledge base, and its own conversation history. A chatbot for Company A never sees Company B's data. Not through access controls — through complete database separation.
Prompt injection defence
When you put user-generated content and AI-generated responses together, prompt injection becomes a real attack vector. We implemented defence in depth:
- Structural separation. Knowledge base content is wrapped in clear delimiters that tell the LLM to treat it as reference data, not instructions. User messages are similarly isolated.
- Instruction reinforcement. After every block of external content, we reinforce the system instructions. This makes it harder for injected instructions to override the chatbot's behaviour.
- Server-side content sanitisation. All content — user input, LLM output, knowledge base articles, scraped web content — passes through server-side HTML sanitisation before storage. Suspicious patterns are flagged and logged.
- SSRF protection. URL validation blocks requests to internal networks, cloud metadata endpoints, and non-HTTP protocols. No one's using the chatbot to scan our infrastructure.
What multi-agent means in practice
InnConnect isn't a single chatbot. It's a multi-agent system where different agents handle different aspects of a customer interaction:
- Intake agent. Understands what the customer needs through conversational questions.
- Knowledge agent. Searches the knowledge base and provides factual answers.
- Escalation agent. Determines when a conversation needs human intervention and routes it to the right team member.
- Matching agent. On marketplace platforms, connects the customer with the most suitable professional based on the conversation context.
These agents share context but have distinct responsibilities. The orchestration layer decides which agent handles each turn of the conversation based on intent classification and conversation state.
The compliance advantage
Building EU-sovereign from the start gives InnConnect tenants concrete compliance advantages:
- GDPR by architecture. Data processing agreements, right to erasure, data portability — these aren't retrofitted features. They're built into the data model.
- No CLOUD Act exposure. Customer conversations and knowledge base content stay in EU data centres, operated by a European company. No American subpoena can compel disclosure.
- Audit trail. Every conversation, every knowledge base change, every admin action is logged with attribution. 365-day retention for audit logs, 90-day retention for security logs.
Performance in production
EU sovereignty doesn't mean compromising on performance. InnConnect handles conversations with sub-second retrieval times, supports real-time typing indicators, and scales horizontally across our Kubernetes cluster. The AI response latency is dominated by LLM inference time, not our infrastructure.
Rate limiting protects against abuse: per-session chat limits, per-user API limits, and per-IP registration limits. All configurable per tenant.
Why this matters now
The EU Data Act entered into force in January 2024 and will be fully applicable by September 2025. The AI Act is rolling out in phases. For businesses deploying AI-powered customer interactions, the window for "we'll figure out compliance later" is closing fast.
Building on EU-sovereign infrastructure isn't just about avoiding fines. It's about building customer trust. When a business can tell its customers "your conversations are processed entirely within the EU, by a European company, with no third-country data access" — that's a competitive advantage that grows more valuable every year.