Both Greptile and CodeRabbit use LLMs, but you can't dump 50K lines of code into a prompt. The usual ways to hand the model only the relevant context:
- RAG: Converts code into chunks and stores them as vectors. When a PR comes in, the system searches for the most relevant chunks.
- Knowledge Graphs: Instead of relying on text search alone, this uses Tree-sitter to parse code into a syntax tree and records where symbols are defined and used, so the AI knows exactly where "Function X" is defined.
- LSP Integration: Uses the Language Server Protocol to track definitions and references across the whole project.
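To make the "where is Function X defined" idea concrete, here is a minimal sketch of a symbol index. It uses Python's built-in `ast` module as a stand-in for Tree-sitter (my assumption, to keep the example dependency-free; Tree-sitter would play the same role across languages). `build_symbol_index` and the two inlined files are hypothetical.

```python
import ast

def build_symbol_index(sources: dict[str, str]) -> dict[str, list[tuple[str, int]]]:
    """Map each function/class name to the (file, line) places where it is defined."""
    index: dict[str, list[tuple[str, int]]] = {}
    for path, code in sources.items():
        tree = ast.parse(code, filename=path)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index.setdefault(node.name, []).append((path, node.lineno))
    return index

# Hypothetical two-file project, inlined so the sketch runs on its own.
sources = {
    "db.py": "def run_query(sql):\n    return sql\n",
    "api.py": "from db import run_query\n\ndef get_user(uid):\n    return run_query(uid)\n",
}
index = build_symbol_index(sources)
print(index["run_query"])  # [('db.py', 1)]
```

When a diff touches a call to `run_query`, a lookup like this is what lets the reviewer pull in the definition instead of guessing at it.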
The setup needs a GitHub App, a backend endpoint that receives its webhooks, and the GitHub API to post comments on specific lines. It also needs a way to parse git diffs.
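The diff parsing part could look roughly like this. It is a sketch that assumes `git diff` style unified output and only tracks added lines along with their line numbers in the new file, since that is what a line-level comment ultimately needs. `added_lines` and `SAMPLE_DIFF` are made-up names for illustration.

```python
import re

# Matches hunk headers such as "@@ -10,3 +10,4 @@" and captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")

def added_lines(diff_text: str) -> list[tuple[str, int, str]]:
    """Return (path, new_line_number, text) for every added line in a git diff."""
    results = []
    path = None
    new_line = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section begins
        elif not in_hunk and line.startswith("+++ "):
            target = line[4:]
            path = target[2:] if target.startswith("b/") else target
        elif (m := HUNK_RE.match(line)):
            new_line = int(m.group(1))
            in_hunk = True
        elif in_hunk:
            if line.startswith("+"):
                results.append((path, new_line, line[1:]))
                new_line += 1
            elif line.startswith("-") or line.startswith("\\"):
                pass  # removed lines and "\ No newline" markers are not in the new file
            else:
                new_line += 1  # context line, present in both versions
    return results

SAMPLE_DIFF = """\
diff --git a/app.py b/app.py
index 1111111..2222222 100644
--- a/app.py
+++ b/app.py
@@ -10,3 +10,4 @@ def handler(request):
     user = request.args["user"]
-    rows = db.execute("SELECT * FROM users WHERE name = '" + user + "'")
+    query = "SELECT * FROM users WHERE name = ?"
+    rows = db.execute(query, (user,))
     return rows
"""

for path, number, text in added_lines(SAMPLE_DIFF):
    print(f"{path}:{number}: {text.strip()}")
```

For the sample above this reports the two added lines at `app.py:11` and `app.py:12`.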
The Code Reviewer logic:
- Summarizes the PR intent.
- Runs security and logic scans for things like SQL injection.
- Uses a second AI agent to validate the first one and prevent hallucinations.
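The two-agent idea can be sketched as a propose-then-confirm loop. Everything here is an assumption for illustration: the `LLM` type is just "prompt in, text out", the JSON reply format is invented, and the two fake models stand in for real API calls so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # prompt in, raw text out; stands in for any model client

@dataclass
class Finding:
    line: int
    comment: str

def review(diff: str, reviewer: LLM, validator: LLM) -> list[Finding]:
    """First pass proposes findings as JSON; second pass keeps only the confirmed ones."""
    raw = reviewer(f"Review this diff. Reply with a JSON list of {{line, comment}}.\n{diff}")
    findings = [Finding(**item) for item in json.loads(raw)]
    confirmed = []
    for finding in findings:
        verdict = validator(
            f"Diff:\n{diff}\n\nClaim on line {finding.line}: {finding.comment}\n"
            "Is this claim supported by the diff? Answer YES or NO."
        )
        if verdict.strip().upper().startswith("YES"):
            confirmed.append(finding)
    return confirmed

# Fake models (hypothetical): the reviewer invents one finding about a line that does not exist.
def fake_reviewer(prompt: str) -> str:
    return json.dumps([
        {"line": 1, "comment": "String concatenation in SQL allows injection."},
        {"line": 9, "comment": "Unused import of os."},
    ])

def fake_validator(prompt: str) -> str:
    return "YES" if "Claim on line 1:" in prompt else "NO"

diff = '+query = "SELECT * FROM users WHERE id = " + user_id'
for finding in review(diff, fake_reviewer, fake_validator):
    print(finding.line, finding.comment)
```

Only the line 1 finding survives here. A real validator would be a second model call, and it can still be wrong, so this reduces hallucinated comments rather than eliminating them.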
The stack:
- ChromaDB for vectors.
- Tree-sitter for parsing.
- LangChain for orchestration.
- Python and Elixir for the backend.
The workflow:
- User opens a PR.
- App pulls the git diff.
- If a function changed, it grabs related context from the vector store.
- Diff + context + PR info goes to the LLM.
- LLM spits out line numbers and comments.
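The retrieval and prompt assembly steps might look like the sketch below. To stay runnable without ChromaDB or an embedding model, it swaps in a toy bag-of-words cosine similarity, which is an assumption and not how a real vector store scores chunks. The chunk contents, `top_chunks`, and `build_prompt` are all hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Crude tokenizer: lowercase identifiers and words, counted."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (here, the diff text)."""
    q = tokens(query)
    return sorted(chunks, key=lambda c: cosine(q, tokens(c)), reverse=True)[:k]

def build_prompt(title: str, diff: str, context: list[str]) -> str:
    joined = "\n\n".join(context)
    return (
        f"PR title: {title}\n\n"
        f"Related code:\n{joined}\n\n"
        f"Diff:\n{diff}\n\n"
        "Reply with one 'line_number: comment' per issue."
    )

# Stand-in for the indexed repository.
chunks = [
    "def hash_password(password, salt):\n    return sha256(salt + password)",
    "def send_email(to, subject, body):\n    smtp.send(to, subject, body)",
    "def check_password(user, password):\n    return user.hash == hash_password(password, user.salt)",
]
diff = (
    "@@ def hash_password(password, salt):\n"
    "-    return sha256(salt + password)\n"
    "+    return bcrypt(salt + password)"
)
context = top_chunks(diff, chunks, k=2)
prompt = build_prompt("Switch to bcrypt", diff, context)
print(prompt)
```

The point is that the unrelated `send_email` chunk never reaches the prompt, while the caller `check_password` does, which is the context a reviewer would want for this change.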
Starting with a CLI tool. It’ll just be git diff -> parse -> spawn task -> print terminal suggestions.
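A rough Python sketch of that pipeline, under a few assumptions: one task per changed file, a thread pool as the "spawn" step, and a placeholder check (flagging leftover TODOs) where the LLM call would eventually go. The function names are mine, and the per-file parse is deliberately coarse.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def split_by_file(diff_text: str) -> dict[str, list[str]]:
    """Group added lines by file path (coarse: ignores hunk positions)."""
    files: dict[str, list[str]] = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            current = line[len("+++ b/"):]
            files[current] = []
        elif current and line.startswith("+") and not line.startswith("+++ "):
            files[current].append(line[1:])
    return files

def review_file(path: str, added: list[str]) -> list[str]:
    """Placeholder task: a real version would send these lines to a model."""
    return [f"{path}: leftover TODO: {text.strip()}" for text in added if "TODO" in text]

def run(diff_text: str) -> list[str]:
    """parse -> spawn one task per file -> collect suggestions."""
    files = split_by_file(diff_text)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(review_file, path, added) for path, added in files.items()]
        return [suggestion for future in futures for suggestion in future.result()]

def main() -> None:
    """Entry point for use inside a git repository: git diff -> run -> print."""
    diff = subprocess.run(
        ["git", "diff"], capture_output=True, text=True, check=True
    ).stdout
    for suggestion in run(diff):
        print(suggestion)

# Demo on an inlined diff so the sketch runs without a repository.
sample = (
    "diff --git a/x.py b/x.py\n--- a/x.py\n+++ b/x.py\n"
    "@@ -1 +1,2 @@\n print('a')\n+# TODO remove debug\n"
)
print(run(sample))
```

Calling `main()` from inside a working tree with uncommitted changes runs the same steps against the real `git diff` output.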
Elixir is perfect for the reviewer part. You can have separate workers for snippets, leak detection, and logic. If the security worker crashes, its supervisor in the supervision tree just restarts it.
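To show the restart-on-crash behaviour without writing Elixir yet, here is a loose Python analogue. It is not OTP: there are no processes or real isolation, and `supervise`, the worker names, and the simulated crash are all invented for illustration. It only demonstrates that one worker failing gets retried while the others are unaffected.

```python
from typing import Callable

def supervise(name: str, worker: Callable[[], list[str]], max_restarts: int = 3) -> list[str]:
    """Run a worker, restarting it after a crash up to max_restarts times."""
    for attempt in range(max_restarts + 1):
        try:
            return worker()
        except Exception as exc:
            print(f"[supervisor] {name} crashed ({exc!r}), attempt {attempt + 1}")
    return [f"{name}: gave up after {max_restarts} restarts"]

calls = {"security": 0}

def security_worker() -> list[str]:
    calls["security"] += 1
    if calls["security"] == 1:
        raise RuntimeError("scanner timed out")  # simulated crash on the first run
    return ["possible SQL injection on line 11"]

def logic_worker() -> list[str]:
    return ["off-by-one in loop bound on line 40"]

workers = {"security": security_worker, "logic": logic_worker}
results = {name: supervise(name, fn) for name, fn in workers.items()}
print(results)
```

In Elixir the same shape would come from a supervisor with one child per worker, which also gives true concurrency and isolation that this sequential loop does not.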
I'll use Groq for tool calling because it's fast and cheap. This lets the LLM read files or make commits. I can also mix models: a fast 8B model