v2.0.0 released · 2026-05-26

Your AI, your infrastructure, your rules.

Cicerone is a self-hosted AI gateway with Modular RAG for document retrieval, Telegram and Discord bots, and fleet management, all in pure Go.

8.5MB Single Binary · 100% Pure Go · 6 RAG Modules · <50ms Query Latency · 54+ Unit Tests

Why Cicerone?

Self-hosted AI that works the way you work. No cloud dependencies, no vendor lock-in.

AI Gateway

WebSocket gateway with Telegram and Discord bot adapters. Connect any LLM through Ollama, llama.cpp, or OpenAI-compatible APIs.

WebSocket · REST API · Prometheus
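
To make "OpenAI-compatible" concrete, here is a minimal sketch of the request and response shape such a backend expects. It is not Cicerone's own code: the helper names (`newChatRequest`, `parseChatReply`) are hypothetical, the base URL is Ollama's default OpenAI-compatible endpoint, and the model name is a placeholder.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// chatMessage and chatRequest mirror the OpenAI chat completions request body.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// chatResponse keeps only the field this sketch reads.
type chatResponse struct {
	Choices []struct {
		Message chatMessage `json:"message"`
	} `json:"choices"`
}

// newChatRequest (hypothetical helper) builds a POST to <baseURL>/chat/completions.
func newChatRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parseChatReply (hypothetical helper) extracts the first choice's text.
func parseChatReply(raw []byte) (string, error) {
	var resp chatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", err
	}
	if len(resp.Choices) == 0 {
		return "", errors.New("response contained no choices")
	}
	return resp.Choices[0].Message.Content, nil
}

func main() {
	// Assumed values: Ollama's default local endpoint and a placeholder model name.
	req, err := newChatRequest("http://localhost:11434/v1", "llama3", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

Passing the request to `http.DefaultClient.Do` and feeding the response body to `parseChatReply` would complete the round trip against a running backend.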

Modular RAG

6-stage pipeline: Search → Retrieve → Memory → Fusion → Adapt → Predict. Point at any directory, index it, query it intelligently.

ChromaDB · 384-dim · Configurable
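
As a rough illustration of what a staged pipeline looks like in Go, the sketch below chains six placeholder stages in the order listed above. The `State`, `Stage`, `mark`, and `RunPipeline` names are hypothetical and do not come from Cicerone's source; each stage only records its name so the flow is visible.

```go
package main

import (
	"fmt"
	"strings"
)

// State is a hypothetical value handed from one stage to the next.
type State struct {
	Query string
	Trace []string // names of the stages that have run, in order
}

// Stage is one pipeline module: a name plus a transformation of the state.
type Stage struct {
	Name string
	Run  func(State) State
}

// mark returns a placeholder stage that only records its own name.
func mark(name string) Stage {
	return Stage{Name: name, Run: func(s State) State {
		s.Trace = append(s.Trace, name)
		return s
	}}
}

// RunPipeline applies each stage in order and returns the final state.
func RunPipeline(stages []Stage, s State) State {
	for _, st := range stages {
		s = st.Run(s)
	}
	return s
}

func main() {
	var stages []Stage
	for _, n := range []string{"search", "retrieve", "memory", "fusion", "adapt", "predict"} {
		stages = append(stages, mark(n))
	}
	out := RunPipeline(stages, State{Query: "How do I deploy Flask?"})
	fmt.Println(strings.Join(out.Trace, " -> "))
}
```

Because every stage shares one signature, dropping or reordering modules is just a matter of changing the slice.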

TUI Dashboard

Full terminal UI with real-time logs, fleet status, connection monitoring, and command execution. 15MB binary, zero dependencies.

tcell · Real-time · 15MB

Security First

TLS termination, rate limiting, audit logging, session persistence. Built for production self-hosting with security in mind.

TLS · Audit · Rate Limit

Fleet Management

Execute commands across multiple machines via SSH. Monitor fleet health, run diagnostics, and manage nodes from one place.

SSH · Multi-node · Doctor

Plugin System

Extensible plugin architecture. Web search, TTS, and RAG ship as built-in plugins. Write your own in Go.

web-search · tts · rag

Modular RAG Pipeline

Six composable modules that adapt to your query type. Factual → concise. Comparative → grouped. Procedural → ordered.

🔍 Search → 📚 Retrieve → 🧠 Memory → 🔀 Fusion → Adapt → 🎯 Predict

Factual Queries

"What is machine learning?" → 3 results, concise format, direct answer

Comparative Queries

"Python vs Go performance" → 6 results, grouped format, side-by-side

Procedural Queries

"How to deploy Flask?" → 5 results, ordered steps, sequential answer
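
The three examples above map a query type to a result count and an answer format. A minimal sketch of that mapping follows; the keyword heuristics in `Classify` are an assumption made for illustration, since the page does not describe how Cicerone actually detects the query type. Only the result counts and formats are taken from the examples.

```go
package main

import (
	"fmt"
	"strings"
)

// Plan holds the retrieval settings the examples associate with a query type.
type Plan struct {
	Type    string
	Results int
	Format  string
}

// Classify picks a plan using simple keyword heuristics (illustrative only).
func Classify(query string) Plan {
	q := strings.ToLower(query)
	switch {
	case strings.Contains(q, " vs ") || strings.Contains(q, "versus") || strings.Contains(q, "compare"):
		return Plan{Type: "comparative", Results: 6, Format: "grouped"}
	case strings.HasPrefix(q, "how to ") || strings.HasPrefix(q, "how do "):
		return Plan{Type: "procedural", Results: 5, Format: "ordered"}
	default:
		return Plan{Type: "factual", Results: 3, Format: "concise"}
	}
}

func main() {
	queries := []string{
		"What is machine learning?",
		"Python vs Go performance",
		"How to deploy Flask?",
	}
	for _, q := range queries {
		fmt.Printf("%q -> %+v\n", q, Classify(q))
	}
}
```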

Quick Start

From zero to querying your documents in under 2 minutes.

1. Install Cicerone

Terminal
# Download the binary
curl -L https://idm.wezzel.com/crab-meat-repos/cicerone-goclaw/-/releases/latest/downloads/cicerone_linux_amd64 -o cicerone
chmod +x cicerone
sudo mv cicerone /usr/local/bin/

# Verify
cicerone version

2. Install ChromaDB

Terminal
# Option A: pip
pip install chromadb
chroma run --host 0.0.0.0 --port 8000 &

# Option B: Docker
docker run -d -p 8000:8000 chromadb/chroma

3. Configure RAG

config.yaml
# Add to cicerone config.yaml
plugins:
  allow:
    - web-search
    - browser
    - tts
    - rag  # ← Enable RAG
env:
  RAG_CHROMA_URL: "http://localhost:8000"
  RAG_COLLECTION: "my_notes"
  RAG_MODULES: "search,retrieve,memory,fusion,adapt,predict"
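
RAG_MODULES is a comma-separated list, so a typo in one name is easy to make. The sketch below shows one way such a value could be split and validated; `ParseModules` is a hypothetical helper, not Cicerone's parser, and rejecting unknown or duplicate names is a design assumption. The six valid names are taken from the config example.

```go
package main

import (
	"fmt"
	"strings"
)

// knownModules lists the six module names from the RAG_MODULES example.
var knownModules = map[string]bool{
	"search": true, "retrieve": true, "memory": true,
	"fusion": true, "adapt": true, "predict": true,
}

// ParseModules splits a comma-separated value and rejects unknown or duplicate names.
func ParseModules(value string) ([]string, error) {
	var modules []string
	seen := map[string]bool{}
	for _, part := range strings.Split(value, ",") {
		name := strings.ToLower(strings.TrimSpace(part))
		if name == "" {
			continue // tolerate stray commas and whitespace
		}
		if !knownModules[name] {
			return nil, fmt.Errorf("unknown RAG module %q", name)
		}
		if seen[name] {
			return nil, fmt.Errorf("duplicate RAG module %q", name)
		}
		seen[name] = true
		modules = append(modules, name)
	}
	if len(modules) == 0 {
		return nil, fmt.Errorf("no RAG modules configured")
	}
	return modules, nil
}

func main() {
	mods, err := ParseModules("search,retrieve,memory,fusion,adapt,predict")
	fmt.Println(mods, err)
}
```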

4. Index & Query

Terminal
# Index your documents
cicerone rag index ~/notes

# Query your knowledge base
cicerone rag query "How do I deploy Flask?"

# Or just chat: RAG auto-injects context
cicerone chat
> ask my notes about authentication

Ready to own your AI?

Self-hosted, privacy-first, extensible. No cloud lock-in.