SovereignRAG: Zero-Egress Vulnerability Assessment with Retrieval-Augmented Generation

Overview

SovereignRAG is a fully local, zero-egress vulnerability assessment system designed for regulated and air-gapped environments where sensitive network telemetry cannot be transmitted to external cloud services.

The project combines:

  • Nmap-based network scanning
  • Parent-Child Retrieval-Augmented Generation (RAG)
  • Local Large Language Models (LLMs)
  • ChromaDB vector retrieval
  • Cross-Encoder reranking
  • OWASP Top 10 knowledge grounding

All inference runs locally on quantized open-source models, so scan data never leaves the host. This supports data sovereignty requirements such as those in HIPAA, PCI-DSS, and NIST SP 800-171.


Research Paper

Title: Empirical Evaluation of Retrieval-Augmented Generation for Sovereign Vulnerability Assessment in Air-Gapped Environments

Authors:

  • Kislay Mishra
  • Priyanshu Bajpai
  • Madhav Goyal
  • Neha Gupta

Key Features

Zero-Egress Security

  • No cloud APIs
  • No external network calls
  • Fully local inference

Parent-Child RAG Architecture

  • 2,000-character parent documents
  • 400-character child chunks
  • 50-character overlap
  • ChromaDB vector storage
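The chunking scheme above can be illustrated with a short sketch. It is not the project's actual splitter (the README does not show it, and the stack lists LangChain, whose splitters also respect separators); it is a plain character-window version that assumes parents are cut without overlap and that the 50-character overlap applies to children only. The names `split_chars`, `build_parent_child`, and `Child` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Child:
    parent_id: int  # index of the parent document this chunk came from
    text: str


def split_chars(text: str, size: int, overlap: int = 0) -> list[str]:
    """Cut text into windows of `size` characters; consecutive windows share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


def build_parent_child(text: str, parent_size: int = 2000,
                       child_size: int = 400, overlap: int = 50):
    """Return (parents, children); every child remembers which parent it belongs to."""
    parents = split_chars(text, parent_size)  # assumption: parents do not overlap
    children = [
        Child(parent_id=i, text=piece)
        for i, parent in enumerate(parents)
        for piece in split_chars(parent, child_size, overlap)
    ]
    return parents, children
```

Only the small child chunks would be embedded and searched; the `parent_id` link is what lets a matching child be swapped for its larger parent when the final context is assembled.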

Multi-Stage Retrieval

  1. Dense vector retrieval
  2. Top-k candidate selection
  3. Cross-Encoder reranking
  4. Parent document expansion
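The four stages can be sketched as a single function. This is an illustrative outline rather than the project's code: the dense retriever (ChromaDB in the real system) and the Cross-Encoder are both replaced by pluggable scoring callables, and `retrieve_parents` and `word_overlap` are hypothetical names. The defaults of 15 candidates and 3 parents follow the champion configuration reported under Experimental Results.

```python
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]  # (query, text) -> relevance score


def retrieve_parents(
    query: str,
    children: Sequence[tuple[int, str]],  # (parent_id, child_text) pairs
    parents: Sequence[str],
    dense_score: Scorer,   # stand-in for embedding similarity search
    rerank_score: Scorer,  # stand-in for the Cross-Encoder
    top_k: int = 15,
    top_parents: int = 3,
) -> list[str]:
    # Stages 1 and 2: score all children cheaply and keep the top-k candidates.
    candidates = sorted(
        children, key=lambda c: dense_score(query, c[1]), reverse=True
    )[:top_k]
    # Stage 3: reorder only those candidates with the more expensive scorer.
    reranked = sorted(
        candidates, key=lambda c: rerank_score(query, c[1]), reverse=True
    )
    # Stage 4: replace children by their parents, dropping duplicate parents.
    chosen: list[int] = []
    for parent_id, _ in reranked:
        if parent_id not in chosen:
            chosen.append(parent_id)
        if len(chosen) == top_parents:
            break
    return [parents[i] for i in chosen]


def word_overlap(query: str, text: str) -> float:
    """Toy scorer used only for demonstration: count of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))
```

Passing `word_overlap` for both scorers is enough to exercise the control flow; in a real deployment the two callables would wrap the embedding search and the reranker model respectively.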

Local LLM Inference

Supported models:

  • Llama 3 8B (Q4)
  • Gemma 2 9B (Q4)

Served through Ollama.
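For illustration, a local Ollama server can be queried with nothing but the Python standard library. The sketch below assumes Ollama is listening on its default address, `localhost:11434`, and uses its `/api/generate` endpoint with streaming disabled. The helper names `build_request` and `generate` are hypothetical, and the README does not show how the project itself calls the model.

```python
import json
import urllib.request

# Ollama's default local address; no traffic leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally served model."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (requires a running Ollama server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model pulled, `generate("llama3:8b", "Summarize the risk of an exposed Telnet service.")` would return the completion as a string.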

Security Knowledge Base

  • OWASP Top 10 (2025)
  • NVD CVE Database
  • High-severity vulnerabilities (CVSS ≥ 8)
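The severity cut-off amounts to a simple filter over CVE records. The sketch below assumes a simplified record shape, a dict with `id` and `cvss` keys, which is hypothetical; real NVD feeds nest the score inside a larger metrics structure, and the README does not describe the project's ingestion code.

```python
def high_severity(records: list[dict], threshold: float = 8.0) -> list[dict]:
    """Keep records whose CVSS base score is at or above the threshold."""
    return [
        r for r in records
        if r.get("cvss") is not None and r["cvss"] >= threshold
    ]


# Placeholder identifiers and scores, not real CVE entries.
sample = [
    {"id": "CVE-0000-0001", "cvss": 9.8},
    {"id": "CVE-0000-0002", "cvss": 7.5},
    {"id": "CVE-0000-0003", "cvss": 8.0},
    {"id": "CVE-0000-0004", "cvss": None},  # unscored entries are skipped
]
kept_ids = [r["id"] for r in high_severity(sample)]
```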

System Architecture

Nmap Scan
    │
    ▼
XML Parser
    │
    ▼
FastAPI Service
    │
    ▼
Parent-Child RAG
    │
 ┌──┴─────────┐
 │ ChromaDB  │
 │ Reranker  │
 └──┬─────────┘
    │
    ▼
Local LLM
    │
    ▼
Remediation Report
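The XML Parser stage in the diagram can be approximated with the standard library. This is a minimal sketch, not the project's parser: it reads the `host`, `address`, `port`, `state`, and `service` elements that `nmap -sV -oX` writes and returns one record per open port. The function name `parse_nmap_xml` and the output field names are hypothetical, and the sample report is a hand-written fragment using a documentation-range IP address.

```python
import xml.etree.ElementTree as ET


def parse_nmap_xml(xml_text: str) -> list[dict]:
    """Return one dict per open port found in an Nmap XML report."""
    root = ET.fromstring(xml_text)
    findings = []
    for host in root.iter("host"):
        addr_el = host.find("address")
        address = addr_el.get("addr") if addr_el is not None else None
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # ignore closed and filtered ports
            service = port.find("service")
            svc = service.attrib if service is not None else {}
            findings.append({
                "host": address,
                "port": int(port.get("portid")),
                "protocol": port.get("protocol"),
                "service": svc.get("name"),
                "product": svc.get("product"),
                "version": svc.get("version"),
            })
    return findings


# Hand-written fragment in the shape of Nmap's XML output.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh" product="OpenSSH" version="8.2p1"/>
      </port>
      <port protocol="tcp" portid="23">
        <state state="closed"/>
      </port>
    </ports>
  </host>
</nmaprun>"""

findings = parse_nmap_xml(SAMPLE)
```

Each record carries the service name, product, and version detected by `-sV`, which is the kind of text a retrieval query could be built from.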

Experimental Results

Champion Configuration

Component          Configuration
Embedding Model    all-MiniLM-L6-v2
Chunk Size         400 characters
Overlap            50 characters
Retrieval          Top-15
Reranker           ms-marco-MiniLM-L-6-v2
Final Context      Top-3 Parents

Retrieval Performance

  • Hit Rate@3: 68.57%
  • MRR: 0.4333
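For reference, both metrics can be computed as follows. The sketch assumes exactly one relevant document per query and that the reciprocal rank is taken over the full ranked list returned for that query; the README does not state whether the reported MRR was truncated at a cut-off, so treat this as one common convention rather than the paper's exact procedure.

```python
def hit_rate_at_k(ranked: list[list[str]], relevant: list[str], k: int = 3) -> float:
    """Fraction of queries whose relevant document appears in the top k results."""
    hits = sum(1 for docs, rel in zip(ranked, relevant) if rel in docs[:k])
    return hits / len(relevant)


def mean_reciprocal_rank(ranked: list[list[str]], relevant: list[str]) -> float:
    """Mean of 1/rank of the relevant document; a query that misses contributes 0."""
    total = 0.0
    for docs, rel in zip(ranked, relevant):
        if rel in docs:
            total += 1.0 / (docs.index(rel) + 1)
    return total / len(relevant)


# Three toy queries: relevant document found at rank 1, rank 3, and rank 4.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r", "s"]]
relevant = ["a", "z", "s"]
hit3 = hit_rate_at_k(ranked, relevant, k=3)   # 2 of 3 queries hit within the top 3
mrr = mean_reciprocal_rank(ranked, relevant)  # (1 + 1/3 + 1/4) / 3
```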

Key Findings

  • Cross-Encoder reranking produced the largest improvement.
  • Domain-specific embeddings underperformed general embeddings.
  • Long-context loading caused complete attention collapse at ~32k tokens.
  • Context windows above 8k tokens significantly increased latency.

Technology Stack

Backend

  • Python
  • FastAPI

Retrieval

  • LangChain
  • ChromaDB
  • Sentence Transformers

Security Tools

  • Nmap

LLM Inference

  • Ollama
  • Llama 3 8B
  • Gemma 2 9B

Evaluation

  • RSCORE Framework
  • Hit Rate@3
  • Mean Reciprocal Rank (MRR)

Installation

Clone Repository

git clone https://github.com/kight7/VAPT-X.git
cd VAPT-X

Create Environment

python -m venv venv

# Linux/Mac
source venv/bin/activate

# Windows
venv\Scripts\activate

Install Dependencies

pip install -r requirements.txt

Install Ollama

Install Ollama itself first, then pull the two supported models:

ollama pull llama3:8b
ollama pull gemma2:9b

Running the System

Start Backend

uvicorn app.main:app --reload

Submit Nmap Scan

nmap -sV target-ip -oX scan.xml

Upload the generated XML to the API endpoint for remediation analysis.


Future Work

  • MITRE ATT&CK integration
  • Vendor advisory ingestion
  • Multi-hardware benchmarking
  • Improved RSCORE validation
  • Agentic remediation workflows

Citation

If you use this work, please cite:

@article{sovereignrag2026,
  title={Empirical Evaluation of Retrieval-Augmented Generation for Sovereign Vulnerability Assessment in Air-Gapped Environments},
  author={Mishra, Kislay and Bajpai, Priyanshu and Goyal, Madhav and Gupta, Neha},
  year={2026}
}

License

MIT License
