## Features
- Privacy-First: All data processing happens locally by default
- Fast & Efficient: Local embeddings and optimized vector search
- Multiple LLM Support: Works with Groq (cloud)
- Local Storage: SQLite database for embeddings and context
- Smart Chunking: Intelligent code splitting with configurable overlap
- Streaming Output: Real-time response streaming for better UX
- Multiple File Types: Supports TypeScript, JavaScript, Python, Go, Rust, Java, and more
- Smart Configuration: Automatically detects project languages and optimizes config
- Intelligent Filtering: Automatically excludes binaries, large files, and build artifacts
- Highly Configurable: Fine-tune chunking, retrieval, and model parameters
- Zero Setup: Works out of the box with sensible defaults
> [!WARNING]
> Codebase Size Limitation: Codexa is optimized for small to medium-sized codebases. It currently supports projects with up to 200 files and 20,000 chunks. For larger codebases, consider using more restrictive `includeGlobs` patterns to focus on specific directories or file types (see the example below).
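For example, on a large monorepo you might scope ingestion to just the directories you care about. The snippet below is only an illustrative sketch: `includeGlobs` and `excludeGlobs` are the option names used in this README, but the surrounding interface, field types, and example values are assumptions rather than Codexa's documented configuration schema.

```ts
// Illustrative only: a narrowed ingestion scope for a large repository.
// `includeGlobs` / `excludeGlobs` are the option names referenced in this README;
// the interface and example values are assumptions made for this sketch.
interface IngestScope {
  includeGlobs: string[]; // only files matching these patterns are ingested
  excludeGlobs: string[]; // matching files are skipped even if included above
}

const scope: IngestScope = {
  includeGlobs: ["src/**/*.ts", "src/**/*.tsx", "docs/**/*.md"],
  excludeGlobs: ["**/*.test.ts", "**/dist/**", "**/node_modules/**"],
};
```

Keeping the ingested set small also keeps you comfortably under the 200-file / 20,000-chunk limit.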
## How It Works
Codexa uses Retrieval-Augmented Generation (RAG) to answer questions about your codebase:
### 1. Ingestion Phase
When you run `codexa ingest`:
- File Discovery: Scans your repository using glob patterns (`includeGlobs`/`excludeGlobs`)
- Smart Filtering: Automatically excludes binaries, large files (>5MB), and build artifacts
- Code Chunking: Splits files into manageable chunks with configurable overlap (sketched after this list)
- Embedding Generation: Creates vector embeddings for each chunk using local transformers
- Storage: Stores chunks and embeddings in a SQLite database (`.codexa/index.db`)
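Conceptually, the chunking step slides a window over each file and repeats a configurable number of lines between consecutive chunks, so code near a chunk boundary still appears with its surrounding context. The sketch below illustrates that idea; the real chunker, its parameters, and its defaults may differ.

```ts
// Sketch of overlap-based chunking (not Codexa's actual implementation).
// Splits a file into chunks of `chunkLines` lines, where each chunk repeats
// the last `overlapLines` lines of the previous chunk.
function chunkByLines(text: string, chunkLines = 60, overlapLines = 10): string[] {
  const lines = text.split("\n");
  const chunks: string[] = [];
  const step = Math.max(1, chunkLines - overlapLines);
  for (let start = 0; start < lines.length; start += step) {
    chunks.push(lines.slice(start, start + chunkLines).join("\n"));
    if (start + chunkLines >= lines.length) break; // last window reached end of file
  }
  return chunks;
}
```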
### 2. Query Phase
When you run `codexa ask`:
- Question Embedding: Converts your question into a vector embedding
- Vector Search: Finds the most similar code chunks using cosine similarity (see the sketch after this list)
- Context Retrieval: Selects top-K most relevant chunks as context
- LLM Generation: Sends question + context to your configured LLM
- Response: Returns an answer grounded in your actual codebase
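At its core, the retrieval step ranks every stored chunk embedding by cosine similarity against the question embedding and keeps the top-K results. The sketch below shows that ranking with assumed data shapes; it is not Codexa's actual retriever API.

```ts
// Sketch of top-K retrieval by cosine similarity (assumed shapes, not Codexa's API).
interface StoredChunk {
  id: number;
  text: string;
  embedding: number[]; // vector produced during ingestion
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors have no direction to compare
}

function topK(queryEmbedding: number[], chunks: StoredChunk[], k = 5): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The selected chunks are what is sent to the LLM alongside your question, which is why answers stay grounded in your actual code.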
### Benefits
- Privacy: All processing happens locally by default
- Speed: Local embeddings and vector search are very fast
- Accuracy: Answers are based on your actual code, not generic responses
- Context-Aware: Understands relationships across your codebase
## Architecture

┌─────────────────┐
│   User Query    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌──────────────┐
│   Embedding     │────▶│    Vector    │
│   Generation    │     │    Search    │
└─────────────────┘     └──────┬───────┘
                               │
                               ▼
                        ┌──────────────┐
                        │   Context    │
                        │  Retrieval   │
                        └──────┬───────┘
                               │
                               ▼
┌─────────────────┐     ┌──────────────┐
│   SQLite DB     │◀────│     LLM      │
│   (Chunks +     │     │    (Groq)    │
│   Embeddings)   │     └──────┬───────┘
└─────────────────┘            │
                               ▼
                        ┌──────────────┐
                        │    Answer    │
                        └──────────────┘

Key Components:
- Chunker: Splits code files into semantic chunks
- Embedder: Generates vector embeddings (local transformers)
- Retriever: Finds relevant chunks using vector similarity
- LLM Client: Generates answers (Groq cloud)
- Database: SQLite for storing chunks and embeddings
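One way to picture how these pieces fit together is as a handful of narrow interfaces wired into the ask flow. This is a conceptual sketch with assumed names and signatures, not Codexa's internal API:

```ts
// Conceptual wiring of the components above (names and signatures are assumed).
interface Embedder {
  embed(text: string): Promise<number[]>; // local transformer embeddings
}

interface Retriever {
  search(queryEmbedding: number[], topK: number): Promise<string[]>; // relevant chunks
}

interface LlmClient {
  complete(question: string, context: string[]): Promise<string>; // e.g. Groq
}

// The ask flow: embed the question, retrieve context, generate a grounded answer.
async function ask(
  question: string,
  embedder: Embedder,
  retriever: Retriever,
  llm: LlmClient,
): Promise<string> {
  const queryEmbedding = await embedder.embed(question);
  const context = await retriever.search(queryEmbedding, 5);
  return llm.complete(question, context);
}
```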