🧠 Cortexa

From Documents → Intelligence → Agents

An Agentic RAG system that transforms raw documents into a living, conversational knowledge engine — built for scale, isolation, and intelligence.

What is Cortexa?

Cortexa is a production-grade AI knowledge infrastructure system built around Agentic Retrieval-Augmented Generation (RAG).

It is not a "chat with PDF" wrapper. It is designed as a full AI architecture that ingests documents, builds structured knowledge, and serves intelligent answers — isolated per user, per chat, per session.

The goal is to evolve from a RAG engine into a full Agentic AI platform capable of multi-step reasoning, tool execution, and autonomous document workflows.

"AI should not just answer questions — it should understand your data."

Design Philosophy

Cortexa is built on four non-negotiable principles:

Principle	Description
🔐 Isolation-first	Every entity is strictly scoped: `user → chat → file → vector chunks`. No leakage across boundaries.
⚙️ Async-first	All heavy AI workloads run in background jobs. The API never blocks on intelligence.
🧠 AI modularity	The RAG pipeline is fully decoupled via gRPC. The AI Brain is a standalone service.
🚀 Agent-ready	Architecture is designed from day one to support tool execution and agentic reasoning.

Architecture

                        Client
                           │
                      NestJS API
                           │
          ┌────────────────┼──────────────────┐
          │                │                  │
        Auth             Files              Chat
          │                │
      PostgreSQL        BullMQ Queue
                            │
                        File Worker
                            │
                          gRPC
                            │
                      Python Brain
                            │
     ┌──────────────────────────────────────────────┐
     │  Extract → Chunk → Embed → Retrieve → Rerank │
     └─────────────────────┬────────────────────────┘
                           │
                       Qdrant DB
                           │
                          LLM

The NestJS API handles all client-facing logic and orchestrates background jobs via BullMQ. Heavy AI work — chunking, embedding, retrieval, and reranking — lives entirely inside the Python Brain service, accessed through a clean gRPC interface.
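To make the retrieval stage concrete, here is a minimal sketch of top-k retrieval by cosine similarity over an in-memory index. It is only an illustration: the function names, the dict-based index, and the toy vectors are assumptions, and the real Brain service delegates this search to Qdrant.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float], index: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the IDs of the top_k chunks most similar to the query vector.

    `index` maps a chunk ID to its embedding (a hypothetical stand-in for a vector DB).
    """
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:top_k]
```

A reranker would then reorder the returned IDs with a more expensive relevance model before the chunks are passed to the LLM.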

Tech Stack

Backend — NestJS

NestJS + TypeScript
PostgreSQL + TypeORM — users, sessions, file records
Redis + BullMQ — async job queue for AI processing
JWT — access (15m) + refresh (7d) tokens via HttpOnly cookies
gRPC — microservice bridge to AI Brain

AI Brain — Python

Python 3 — gRPC server
Qdrant — vector database for semantic search
Gemini — LLM layer
Sentence-aware chunking — context-preserving document splitting
Embedding models — dense vector generation
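As a rough illustration of sentence-aware chunking, the sketch below packs whole sentences into size-bounded chunks and repeats the last sentence of each chunk at the start of the next. The regex splitter, the `max_chars` and `overlap` parameters, and the function names are assumptions for this example, not Cortexa's actual implementation.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(text: str, max_chars: int = 200, overlap: int = 1) -> list[str]:
    """Pack whole sentences into chunks, aiming for max_chars characters each.

    The last `overlap` sentences of a chunk are repeated at the start of the
    next one so context carries across boundaries. A sentence is never cut,
    so a chunk can exceed max_chars when a sentence (plus overlap) is long.
    """
    chunks: list[str] = []
    current: list[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries keeps each embedded chunk readable on its own, which a fixed-width character split does not guarantee.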

Project Structure

src/
└── modules/
    ├── auth/          → JWT authentication + session management
    ├── users/         → user lifecycle
    ├── chat/          → conversation isolation + history
    ├── files/         → upload, deduplication, lifecycle tracking
    ├── session/       → refresh token sessions
    ├── queue/         → BullMQ job configuration
    └── ai-service/    → gRPC bridge to Python Brain

What's Built

✅ Authentication System

JWT access + refresh token flow
HttpOnly cookie security
Multi-device session tracking

✅ File Intelligence Pipeline

File upload API
SHA-256 deduplication engine — no duplicate processing
Async processing via BullMQ workers
File lifecycle states: uploaded → processing → processed (or failed on error)
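The deduplication and lifecycle ideas above can be sketched together in a few lines. This is a simplified, in-memory illustration: `FileRegistry`, the transition table, and deduplicating on the content hash alone (rather than per user or per chat) are assumptions, and retries of failed files are not modeled.

```python
import hashlib
from enum import Enum


class FileState(str, Enum):
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


# Assumed transition table: processing ends in either processed or failed.
ALLOWED = {
    FileState.UPLOADED: {FileState.PROCESSING},
    FileState.PROCESSING: {FileState.PROCESSED, FileState.FAILED},
    FileState.PROCESSED: set(),
    FileState.FAILED: set(),
}


class FileRegistry:
    """In-memory stand-in for the file records table (hypothetical)."""

    def __init__(self) -> None:
        self._states: dict[str, FileState] = {}

    def register(self, content: bytes) -> tuple[str, bool]:
        """Return (SHA-256 hex digest, is_new); known content is not registered again."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._states:
            return digest, False
        self._states[digest] = FileState.UPLOADED
        return digest, True

    def transition(self, digest: str, new_state: FileState) -> None:
        """Move a file to new_state, rejecting moves the table does not allow."""
        current = self._states[digest]
        if new_state not in ALLOWED[current]:
            raise ValueError(f"illegal transition: {current.value} -> {new_state.value}")
        self._states[digest] = new_state

    def state(self, digest: str) -> FileState:
        return self._states[digest]
```

In this sketch a caller would enqueue a processing job only when `register` reports `is_new`, so a repeated upload of the same bytes never reaches the worker.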

✅ AI Processing Infrastructure

Background ingestion worker
gRPC communication layer between NestJS and Python Brain
Full separation of AI logic from business logic

✅ RAG Foundation

Document chunking pipeline design
Qdrant vector database integration
Multi-user and multi-chat vector isolation model

In Progress

gRPC Brain ingestion service — completing the full ingestion flow
Sentence-aware chunking — context-preserving splits
Reranking system — improving retrieval precision
Chat memory optimization — compressing long conversation context
Streaming responses — real-time answer delivery

Missing / Planned

These are the next intelligence layers on the roadmap:

Hybrid search — BM25 + vector fusion for better recall
Context compression — smart pruning of retrieved chunks
Long-term memory — persistent user/chat knowledge
Conversation summarization — memory-efficient history
Agent execution layer — tool usage + multi-step reasoning
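For the planned hybrid search, one common way to fuse a BM25 ranking with a vector ranking is reciprocal rank fusion (RRF). The README does not say which fusion method Cortexa will use, so treat this as one possible approach; the function name and the conventional default `k = 60` are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first rankings of chunk IDs into one list.

    Each chunk scores sum(1 / (k + rank)) over the rankings it appears in,
    with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))
```

RRF only needs rank positions, so BM25 scores and cosine similarities never have to be normalized onto a shared scale.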

Roadmap

Phase 1 → RAG Engine          (in progress)
Phase 2 → Intelligence Layer  (hybrid search, memory, reranking)
Phase 3 → Agent System        (tools, reasoning, workflows)
Phase 4 → AI OS               (full knowledge operating system)

Data Isolation Model

Every piece of knowledge is strictly scoped to its owner:

user_id → chat_id → file_id → vector chunks

There is no cross-user leakage and no cross-chat contamination. Each knowledge space is fully isolated at the vector level.
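A minimal sketch of what this scoping means at query time: every search is restricted to the caller's `user_id` and `chat_id` before any similarity ranking happens. The `Chunk` dataclass and `scoped_chunks` helper are hypothetical. In a vector database the same restriction would typically be expressed as a metadata filter on each query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    user_id: str
    chat_id: str
    file_id: str
    text: str


def scoped_chunks(chunks: list[Chunk], user_id: str, chat_id: str) -> list[Chunk]:
    """Keep only chunks owned by this user and this chat (both must match)."""
    return [c for c in chunks if c.user_id == user_id and c.chat_id == chat_id]
```

Filtering on both fields matters: two users may have chats with the same ID, and one user may have many chats.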

Contributing

Cortexa is an open, evolving project. Contributions are welcome across:

AI / RAG — chunking strategies, retrieval tuning, reranking, hybrid search
Backend — NestJS modules, queue optimizations, gRPC services
Python Brain — embedding pipeline, memory systems, agent tooling
Documentation — architecture diagrams, guides, API docs

If you're interested in contributing, open an issue to discuss what you'd like to work on. The project is still being actively built, so coordination matters.

Running Locally

Full setup guide coming soon. The system requires Node.js (for the NestJS API), Python 3, PostgreSQL, Redis, and Qdrant running locally or via Docker.

# Clone the repo
git clone https://github.com/your-username/cortexa.git
cd cortexa

# Set up environment variables
cp .env.example .env

# Run make, which starts all the projects along with their dependencies
make

# Start services (Docker Compose coming soon)
