AI chatbot on your own messenger — 3motiBot proof
Matrix bot with LLM backend, running on my own 3motibot server. Optionally with on-prem Mistral Small 3.1 or a commercial cloud LLM API. Conversation memory, system prompts, domain tuning — everything self-implemented.
What is this?
3motiBot — my own Matrix server at 3motibot.de — has an integrated AI chatbot. Any user can invite the bot to a room or DM it and receive AI answers. The bot uses a large language model as backend and a Python wrapper I wrote myself — for context handling, system prompts and rate limiting.
Technical content
- Matrix pipeline — the bot is a pure Matrix client (Python
matrix-nio). No message leaves the Matrix room unnecessarily; only the user’s text goes to the LLM backend. - Swappable LLM backend — the architecture is built so that on-prem Mistral Small 3.1 or Codestral via vLLM/Ollama can be plugged in directly. Alternatively I integrate a commercial cloud LLM API when maximum answer quality matters more than data sovereignty. Chosen per client project.
- Wrapper framework (self-written):
- Context handling with per-conversation token budget
- System prompts configurable per room/user (persona, domain knowledge, tone)
- Rate limit per user per hour
- Error handling with fallback to secondary backend on API outage
- Persistent conversation history in PostgreSQL, separated per Matrix room.
- Admin commands via DM:
/reset,/persona <name>,/stats.
Where this matters
Scenario A — customer support bot on your own infrastructure: AI handles 70–80 % of first enquiries. On Matrix the conversation history lives on your server, not with a SaaS provider.
Scenario B — internal knowledge assistant: handbooks, processes, FAQs as system prompt or RAG context. The bot answers employee questions, especially valuable for newcomers and documentation navigation.
Scenario C — domain-specific assistant: legal (first review), tax pre-check, technical first diagnosis — bot with domain-specific prompt delivers structured initial answers. Human expertise stays in the loop.
Backend choice: on-prem or cloud API?
| Criterion | Mistral Small 3.1 on-prem | Commercial cloud API |
|---|---|---|
| Answer quality | good, depends on hardware | excellent |
| Privacy | stays fully with you | data goes to the API provider |
| Cost | hardware + power | ~€0.003–0.015/message |
| Hardware requirement | min. 24 GB VRAM (e.g. RTX 4090) | none |
| Availability | internally controlled | cloud-dependent |
My recommendation: if data sovereignty matters, start on-prem straight away. If maximum answer quality matters or you want a quick proof of concept without GPU investment, start with a cloud API and migrate later. The wrapper code works for both backends without changes to bot behaviour.
The offering
Setup of your own AI chatbot including Matrix integration, wrapper code, system prompt design, 30 days post-launch support.
- Typical setup: €3,900–8,900 one-off (depending on use-case complexity)
- Running: from €39/month (server + API budget) or variable with your own API key
- Optional: retainer for prompt tuning and backend updates
Status
In production since 2025 on my own 3motibot server. Offered as a service — on request to first pilot clients at pilot terms.
Outcomes
- Own AI chatbot live at 3motibot.de
- Wrapper framework: context handling, system prompts, rate limit, error handling
- Swappable backend: on-prem LLM or cloud API via config
- Persistent conversation history per Matrix room
- Admin commands via DM: /reset, /persona, /stats
Your own AI chatbot for your team?
Whether customer support, internal knowledge bot or domain-specific assistant — I build the bot on your own Matrix infrastructure. You choose the backend: on-prem Mistral Small 3.1 for strict privacy requirements, or a cloud API for maximum answer quality.
Request a kickoff