Documentation

Everything you need to get Cicerone running with RAG.

Install Cicerone

Bash
# Download the latest Linux amd64 binary
curl -L https://idm.wezzel.com/crab-meat-repos/cicerone-goclaw/-/releases/latest/downloads/cicerone_linux_amd64 -o cicerone

# Make it executable and move it onto the PATH
chmod +x cicerone && sudo mv cicerone /usr/local/bin/

Install ChromaDB

Bash
# pip install
pip install chromadb
chroma run --host 0.0.0.0 --port 8000 &

# or Docker
docker run -d -p 8000:8000 -v chroma_data:/chroma/chroma \
  chromadb/chroma

Configure

config.yaml
gateway:
  listenAddr: "0.0.0.0:14170"
plugins:
  allow: [web-search, browser, tts, rag]
env:
  RAG_CHROMA_URL: "http://localhost:8000"
  RAG_COLLECTION: "my_notes"
  RAG_TOP_K: "5"
  RAG_MODULES: "search,retrieve,memory,fusion,adapt,predict"
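Every value in the env block is a string, so RAG_TOP_K and RAG_MODULES have to be parsed before they can be used. The following is a minimal Go sketch of that step. The ragSettings type and parseSettings function are hypothetical names chosen for illustration; the actual handling lives in config.go and may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ragSettings is a hypothetical typed view of the env block above.
// It is not Cicerone's actual RAGConfig.
type ragSettings struct {
	ChromaURL  string
	Collection string
	TopK       int
	Modules    []string
}

// parseSettings converts the string-valued env map into typed settings.
// The validation rules here (URL required, TopK >= 1) are assumptions.
func parseSettings(env map[string]string) (ragSettings, error) {
	s := ragSettings{
		ChromaURL:  env["RAG_CHROMA_URL"],
		Collection: env["RAG_COLLECTION"],
	}
	if s.ChromaURL == "" {
		return s, fmt.Errorf("RAG_CHROMA_URL is required")
	}
	k, err := strconv.Atoi(env["RAG_TOP_K"])
	if err != nil || k < 1 {
		return s, fmt.Errorf("RAG_TOP_K must be a positive integer, got %q", env["RAG_TOP_K"])
	}
	s.TopK = k
	// Split the comma-separated module list, ignoring blanks and stray spaces.
	for _, m := range strings.Split(env["RAG_MODULES"], ",") {
		if m = strings.TrimSpace(m); m != "" {
			s.Modules = append(s.Modules, m)
		}
	}
	return s, nil
}

func main() {
	s, err := parseSettings(map[string]string{
		"RAG_CHROMA_URL": "http://localhost:8000",
		"RAG_COLLECTION": "my_notes",
		"RAG_TOP_K":      "5",
		"RAG_MODULES":    "search,retrieve,memory,fusion,adapt,predict",
	})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(s.TopK, s.Modules)
}
```

Failing fast on a malformed RAG_TOP_K keeps a typo in config.yaml from surfacing later as an empty or oversized result set.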

Commands

RAG Commands
# Index a directory into ChromaDB
cicerone rag index ~/notes

# Query your knowledge base
cicerone rag query "How do I set up TLS?"

# List indexed collections
cicerone rag list

# Show pipeline status
cicerone rag status

# Remove a collection
cicerone rag remove old_collection

Injection Modes

Control how RAG context enters your LLM conversations.

Auto

Always inject RAG context when a collection exists. Zero-friction intelligence.

RAG_INJECT_MODE: "auto"

On-Demand

Only inject when triggered: "ask my notes", "check my documents", "from my notes".

RAG_INJECT_MODE: "on-demand"

Off

Disable RAG injection entirely. RAG remains available via explicit commands.

RAG_INJECT_MODE: "off"
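The three modes reduce to a small decision function. The Go sketch below illustrates that logic. shouldInject is a hypothetical helper, not the code in context_injector.go. It also makes two assumptions: trigger phrases are matched as case-insensitive substrings, and on-demand mode still requires an existing collection.

```go
package main

import (
	"fmt"
	"strings"
)

// triggerPhrases are the on-demand triggers listed in this section.
var triggerPhrases = []string{"ask my notes", "check my documents", "from my notes"}

// shouldInject reports whether RAG context should be added to a message.
// Hypothetical helper: the matching rules are assumptions, not Cicerone's API.
func shouldInject(mode, message string, collectionExists bool) bool {
	switch mode {
	case "auto":
		// Always inject, provided there is a collection to read from.
		return collectionExists
	case "on-demand":
		if !collectionExists { // assumption: nothing to inject without a collection
			return false
		}
		msg := strings.ToLower(message) // assumption: case-insensitive match
		for _, p := range triggerPhrases {
			if strings.Contains(msg, p) {
				return true
			}
		}
		return false
	default:
		// "off" and any unrecognized value disable injection.
		return false
	}
}

func main() {
	fmt.Println(shouldInject("auto", "hello", true))                       // true
	fmt.Println(shouldInject("on-demand", "Ask my notes about TLS", true)) // true
	fmt.Println(shouldInject("on-demand", "How do I set up TLS?", true))   // false
	fmt.Println(shouldInject("off", "ask my notes", true))                 // false
}
```

Treating unknown mode values as "off" is a conservative default: a typo in RAG_INJECT_MODE then disables injection rather than silently enabling it.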

Architecture

Package Structure
rag/
├── rag.go                 # Version, module registry
├── config.go              # RAGConfig, defaults, validation
├── types.go               # Result, Chunk, PipelineState, PipelineResult
├── store.go               # Store interface (ChromaDB abstraction)
├── chroma_client.go       # ChromaDB REST client
├── embedder.go            # Embedder interface + ChromaEmbedder
├── chunker.go             # 3 chunking strategies
├── walker.go              # Directory walker
├── indexer.go             # Walk → Chunk → Store pipeline
├── modules.go             # Module interface + registry
├── module_search.go       # Query classification
├── module_retrieve.go     # ChromaDB query + MockStore
├── module_memory.go       # LRU query cache
├── module_fusion.go       # Dedup + sort by relevance
├── module_adapt.go        # Format selection per query type
├── module_predict.go      # Structured answer generation
├── pipeline.go            # Orchestrator: Execute(), partial results
├── plugin_rag.go          # RAGPlugin (cicerone Plugin interface)
├── context_injector.go    # LLM context injection
├── collection_manager.go  # Multi-collection management
└── resilience.go          # QueryCache + CircuitBreaker
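To make one of the stages concrete, the fusion step ("Dedup + sort by relevance") can be sketched in Go as follows. This is an illustration only. The Result fields and the fuse function are hypothetical stand-ins for whatever types.go and module_fusion.go define, and the sketch assumes that a higher score means a more relevant result.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical stand-in for the Result type in types.go.
type Result struct {
	ID    string
	Text  string
	Score float64 // assumption: higher means more relevant
}

// fuse removes duplicate IDs (keeping the best-scoring copy), sorts by
// descending score, and truncates to topK when topK is positive.
func fuse(results []Result, topK int) []Result {
	best := make(map[string]Result, len(results))
	for _, r := range results {
		if cur, ok := best[r.ID]; !ok || r.Score > cur.Score {
			best[r.ID] = r
		}
	}
	out := make([]Result, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID // tie-break so the order is deterministic
	})
	if topK > 0 && len(out) > topK {
		out = out[:topK]
	}
	return out
}

func main() {
	fused := fuse([]Result{
		{ID: "a", Text: "TLS setup", Score: 0.71},
		{ID: "b", Text: "Cert renewal", Score: 0.90},
		{ID: "a", Text: "TLS setup", Score: 0.85}, // duplicate of "a" with a better score
		{ID: "c", Text: "Firewall notes", Score: 0.40},
	}, 2)
	for _, r := range fused {
		fmt.Printf("%s %.2f\n", r.ID, r.Score)
	}
}
```

Deduplicating by ID before sorting means a chunk returned by several retrieval passes counts once, at its best score, so it cannot crowd other chunks out of the top-K slots.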