ApplexX7/Cortexa

🧠 Cortexa

From Documents → Intelligence → Agents

An Agentic RAG system that transforms raw documents into a living, conversational knowledge engine — built for scale, isolation, and intelligence.



What is Cortexa?

Cortexa is a production-grade AI knowledge infrastructure system built around Agentic Retrieval-Augmented Generation (RAG).

It is not a "chat with PDF" wrapper. It is designed as a full AI architecture that ingests documents, builds structured knowledge, and serves intelligent answers — isolated per user, per chat, per session.

The goal is to evolve from a RAG engine into a full Agentic AI platform capable of multi-step reasoning, tool execution, and autonomous document workflows.

"AI should not just answer questions — it should understand your data."


Design Philosophy

Cortexa is built on four non-negotiable principles:

Principle | Description
🔐 Isolation-first | Every entity is strictly scoped: user → chat → file → vector chunks. No leakage across boundaries.
⚙️ Async-first | All heavy AI workloads run in background jobs. The API never blocks on intelligence.
🧠 AI modularity | The RAG pipeline is fully decoupled via gRPC. The AI Brain is a standalone service.
🚀 Agent-ready | Architecture is designed from day one to support tool execution and agentic reasoning.

Architecture

                        Client
                           │
                      NestJS API
                           │
          ┌────────────────┼──────────────────┐
          │                │                  │
        Auth             Files              Chat
          │                │
      PostgreSQL        BullMQ Queue
                            │
                        File Worker
                            │
                          gRPC
                            │
                      Python Brain
                            │
     ┌──────────────────────────────────────────────┐
     │  Extract → Chunk → Embed → Retrieve → Rerank │
     └─────────────────────┬────────────────────────┘
                           │
                       Qdrant DB
                           │
                          LLM

The NestJS API handles all client-facing logic and orchestrates background jobs via BullMQ. Heavy AI work — chunking, embedding, retrieval, and reranking — lives entirely inside the Python Brain service, accessed through a clean gRPC interface.
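The enqueue-then-process flow described above can be illustrated with a small, self-contained sketch. It uses Python's `asyncio.Queue` purely as a stand-in for BullMQ, and `brain_ingest`, `upload`, and `file_worker` are hypothetical names, not Cortexa's actual functions. The point is only that the handler returns before any AI work has started.

```python
import asyncio


async def brain_ingest(file_id: str) -> int:
    """Hypothetical stand-in for the gRPC call into the Python Brain."""
    await asyncio.sleep(0.01)  # simulate slow extraction and embedding work
    return 3  # pretend three chunks were stored


async def upload(queue: asyncio.Queue, status: dict, file_id: str) -> dict:
    """API handler: record the file, enqueue a job, return immediately."""
    status[file_id] = "uploaded"
    await queue.put(file_id)
    return {"file_id": file_id, "status": status[file_id]}


async def file_worker(queue: asyncio.Queue, status: dict) -> None:
    """Background worker: drain jobs and update the lifecycle state."""
    while True:
        file_id = await queue.get()
        status[file_id] = "processing"
        try:
            await brain_ingest(file_id)
            status[file_id] = "processed"
        except Exception:
            status[file_id] = "failed"
        finally:
            queue.task_done()


async def demo() -> tuple[dict, dict]:
    queue: asyncio.Queue = asyncio.Queue()
    status: dict = {}
    worker = asyncio.create_task(file_worker(queue, status))
    response = await upload(queue, status, "file-1")  # returns before processing
    await queue.join()  # wait for the worker only so the demo can inspect state
    worker.cancel()
    return response, status
```

Calling `asyncio.run(demo())` returns the handler's response, captured while the file was still in the `uploaded` state, together with the final status map after the worker has finished.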


Tech Stack

Backend — NestJS

  • NestJS + TypeScript
  • PostgreSQL + TypeORM — users, sessions, file records
  • Redis + BullMQ — async job queue for AI processing
  • JWT access (15m) + refresh (7d) tokens, delivered via HttpOnly cookies
  • gRPC — microservice bridge to AI Brain

AI Brain — Python

  • Python 3 — gRPC server
  • Qdrant — vector database for semantic search
  • Gemini — LLM layer
  • Sentence-aware chunking — context-preserving document splitting
  • Embedding models — dense vector generation
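As a rough illustration of what sentence-aware chunking means, the sketch below packs whole sentences into size-limited chunks and repeats the last sentence of each chunk at the start of the next. It is not the Brain's actual chunker: the regex splitter is deliberately naive (it will mis-split abbreviations), and `split_sentences`, `chunk_by_sentence`, `max_chars`, and `overlap` are names assumed for this example.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_by_sentence(text: str, max_chars: int = 200, overlap: int = 1) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    The last `overlap` sentences of a chunk are repeated at the start of the
    next one so context carries across the boundary. A single sentence longer
    than max_chars becomes its own oversized chunk rather than being cut.
    """
    chunks: list[str] = []
    current: list[str] = []
    for sentence in split_sentences(text):
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            carried = current[-overlap:] if overlap > 0 else []
            # Drop the overlap if it would push the next chunk over the limit.
            if len(" ".join(carried + [sentence])) > max_chars:
                carried = []
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunk boundaries only ever fall between sentences, no chunk starts or ends mid-thought, which is the property the "context-preserving" label refers to.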

Project Structure

src/
└── modules/
    ├── auth/          → JWT authentication + session management
    ├── users/         → user lifecycle
    ├── chat/          → conversation isolation + history
    ├── files/         → upload, deduplication, lifecycle tracking
    ├── session/       → refresh token sessions
    ├── queue/         → BullMQ job configuration
    └── ai-service/    → gRPC bridge to Python Brain

What's Built

✅ Authentication System

  • JWT access + refresh token flow
  • HttpOnly cookie security
  • Multi-device session tracking

✅ File Intelligence Pipeline

  • File upload API
  • SHA256 deduplication engine — no duplicate processing
  • Async processing via BullMQ workers
  • File lifecycle states: uploaded → processing → processed (or failed on error)
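The deduplication and lifecycle bullets above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cortexa's actual code: `FileRegistry`, `advance`, and the transition table are hypothetical, deduplication is assumed to be scoped per user, and `failed` is assumed to be terminal (no retries).

```python
import hashlib
from enum import Enum


class FileState(str, Enum):
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


# Allowed moves; PROCESSED and FAILED are terminal in this sketch.
TRANSITIONS = {
    FileState.UPLOADED: {FileState.PROCESSING},
    FileState.PROCESSING: {FileState.PROCESSED, FileState.FAILED},
    FileState.PROCESSED: set(),
    FileState.FAILED: set(),
}


def advance(current: FileState, target: FileState) -> FileState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


class FileRegistry:
    """In-memory stand-in for the file records table (hypothetical)."""

    def __init__(self) -> None:
        self._by_hash: dict[tuple[str, str], FileState] = {}

    def register(self, user_id: str, content: bytes) -> tuple[str, bool]:
        """Return (sha256_hex, is_new). Duplicates should not be re-queued."""
        digest = hashlib.sha256(content).hexdigest()
        key = (user_id, digest)
        if key in self._by_hash:
            return digest, False
        self._by_hash[key] = FileState.UPLOADED
        return digest, True
```

Hashing the raw bytes means a renamed copy of the same file is still detected as a duplicate, so it never reaches the worker a second time.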

✅ AI Processing Infrastructure

  • Background ingestion worker
  • gRPC communication layer between NestJS and Python Brain
  • Full separation of AI logic from business logic

✅ RAG Foundation

  • Document chunking pipeline design
  • Qdrant vector database integration
  • Multi-user and multi-chat vector isolation model

In Progress

  • gRPC Brain ingestion service — completing the full ingestion flow
  • Sentence-aware chunking — context-preserving splits
  • Reranking system — improving retrieval precision
  • Chat memory optimization — compressing long conversation context
  • Streaming responses — real-time answer delivery

Missing / Planned

These are the next intelligence layers on the roadmap:

  • Hybrid search — BM25 + vector fusion for better recall
  • Context compression — smart pruning of retrieved chunks
  • Long-term memory — persistent user/chat knowledge
  • Conversation summarization — memory-efficient history
  • Agent execution layer — tool usage + multi-step reasoning
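For the hybrid search item, one common way to fuse a BM25 ranking with a vector ranking is Reciprocal Rank Fusion (RRF). The roadmap does not say which fusion method Cortexa will use, so treat the sketch below as one possible approach; the function name and the constant `k = 60` (a conventional default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

RRF needs only rank positions, not raw scores, so BM25 scores and cosine similarities never have to be normalised onto a common scale. Chunks that appear near the top of both lists rise above chunks that appear in only one.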

Roadmap

Phase 1 → RAG Engine          (in progress)
Phase 2 → Intelligence Layer  (hybrid search, memory, reranking)
Phase 3 → Agent System        (tools, reasoning, workflows)
Phase 4 → AI OS               (full knowledge operating system)

Data Isolation Model

Every piece of knowledge is strictly scoped to its owner:

user_id → chat_id → file_id → vector chunks

There is no cross-user leakage and no cross-chat contamination. Each knowledge space is fully isolated at the vector level.
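A minimal in-memory sketch of this scoping rule follows. In Qdrant the same idea would typically be expressed as a payload filter on the search request; here a plain list and cosine similarity stand in for the vector database, and `Chunk` and `scoped_search` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    user_id: str
    chat_id: str
    file_id: str
    text: str
    vector: tuple[float, ...]


def cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def scoped_search(chunks, query, user_id, chat_id, top_k=3):
    """Filter by scope first, then rank; out-of-scope chunks are never scored."""
    in_scope = [c for c in chunks if c.user_id == user_id and c.chat_id == chat_id]
    ranked = sorted(in_scope, key=lambda c: cosine(c.vector, query), reverse=True)
    return ranked[:top_k]
```

Applying the scope filter before ranking, rather than trimming results afterwards, means a highly similar chunk from another user or chat can never crowd out or leak into the result set.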


Contributing

Cortexa is an open, evolving project. Contributions are welcome across:

  • AI / RAG — chunking strategies, retrieval tuning, reranking, hybrid search
  • Backend — NestJS modules, queue optimizations, gRPC services
  • Python Brain — embedding pipeline, memory systems, agent tooling
  • Documentation — architecture diagrams, guides, API docs

If you're interested in contributing, open an issue to discuss what you'd like to work on. The project is still being actively built, so coordination matters.


Running Locally

Full setup guide coming soon. The system requires NestJS, Python 3, PostgreSQL, Redis, and Qdrant running locally or via Docker.

# Clone the repo
git clone https://github.com/your-username/cortexa.git
cd cortexa

# Set up environment variables
cp .env.example .env

# Run make, which builds and starts all the projects along with their dependencies
make


# Start services (Docker Compose coming soon)

License

MIT License — © 2026 Cortexa
