Cicerone is a self-hosted AI gateway with Modular RAG, Telegram/Discord bots, fleet management, and an intelligent document retrieval pipeline, all in pure Go.
Self-hosted AI that works the way you work. No cloud dependencies, no vendor lock-in.
WebSocket gateway with Telegram and Discord bot adapters. Connect any LLM through Ollama, llama.cpp, or OpenAI-compatible APIs.
6-stage pipeline: Search → Retrieve → Memory → Fusion → Adapt → Predict. Point it at any directory, index it, query it intelligently.
Full terminal UI with real-time logs, fleet status, connection monitoring, and command execution. 15 MB binary, zero dependencies.
TLS termination, rate limiting, audit logging, session persistence. Built for production self-hosting with security in mind.
Execute commands across multiple machines via SSH. Monitor fleet health, run diagnostics, and manage nodes from one place.
Extensible plugin architecture. Web search, TTS, and RAG ship as built-in plugins. Write your own in Go.
Six composable modules that adapt to your query type. Factual → concise. Comparative → grouped. Procedural → ordered.
"What is machine learning?" → 3 results, concise format, direct answer
"Python vs Go performance" → 6 results, grouped format, side-by-side
"How to deploy Flask?" → 5 results, ordered steps, sequential answer
From zero to querying your documents in under 2 minutes.
Self-hosted, privacy-first, extensible. No cloud lock-in.