CONTAGION: AI Agent Mesh Security Demonstration

This project is a security demonstration showing an automated AI agent mesh, prompt injection propagation (simulating the Morris II worm model), and zero-trust security controls.

Design

The system processes incoming data (such as emails) using a pipeline of 8 specialized AI agents (Email, Calendar, Code, Finance, HR, CRM, Search, File). Security is enforced by a Judge Agent at the gateway:

Zero Trust Policy: When enabled, the Judge Agent scans incoming messages for adversarial prompt injection signatures and quarantines threats.
Cascading Compromise: When disabled, prompt injection payloads propagate sequentially through the agent pipeline, showing how a single compromised node can infect the entire network.

Key Features

Three-Phase Pipeline: Structured workflow separation (Data Ingestion -> Security Evaluation -> Domain Processing).
Model-as-a-Judge: Gatekeeper agent implementing low-temperature classification for adversarial input detection.
8 Core Business Agents: Real operational functions running independent LLMs.
Worm Propagation Simulation: Demonstrates adversarial context-copying prompt injection based on arXiv:2403.02817 research.
Real-time Telemetry Dashboard: Frontend displaying simulation states, simulated data exfiltration logs (illustrative egress for visualization), and propagation generation metrics via Server-Sent Events (SSE).

Technical Stack

Frontend: Next.js 16 (App Router), Zustand, Prisma ORM, SQLite.
Backend: FastAPI, Google Agent Development Kit (ADK) Python SDK.
LLM Engine: Google Gemini API (models/gemini-2.5-pro for the Judge, models/gemini-2.5-flash-lite for the business pipeline).
Environment: Docker and Docker Compose.

Installation and Execution

Configure the .env file in the project root:

DATABASE_URL="file:./prisma/contagion.db"
GOOGLE_GENAI_USE_VERTEXAI=0
GEMINI_API_KEY=your_gemini_api_key_here

Start the services:
```
docker compose up --build
```

Frontend UI: http://localhost:3000
FastAPI API Docs: http://localhost:8000/docs

Simulation Scenarios

Scenario 1: Judge Block (Adversarial Payload + Security Gate)

Request Configuration: judgeEnabled = true, useWorm = true.
Result: The Judge Agent identifies the attack, triggers a shield_alert, and halts execution (worm_blocked).

Scenario 2: Pipeline Compromise (Adversarial Payload + Gate Bypassed)

Request Configuration: judgeEnabled = false, useWorm = true.
Result: The payload propagates through all 8 agents (Gen 1 through Gen 8), exposing domain-specific credentials and data.

Scenario 3: Clean Execution (Standard Input + Security Gate)

Request Configuration: judgeEnabled = true, useWorm = false.
Result: The message is scanned, cleared, and runs through the 8 agents successfully.

Architecture and Design Decisions

Detailed architectural specifications, data flow diagrams, and threat patterns are documented in the ARCHITECTURE.md reference.

Limitations & Evaluation

Measured Performance

We evaluated the zero-trust Judge Agent (configured with intent-based detection guidelines) against a test suite of clean and adversarial inputs (run_eval.py). The results are saved in results.md:

Detection Rate (True Positive Rate): 100.0% (13/13 malicious emails detected, including paraphrases, ROT13, hex, reversed text, Swahili language, homoglyphs, and smuggling attacks)
False Positive Rate: 12.5% (1/8 clean emails flagged, specifically an email requesting to forward a signed document to an external vendor partner)

Key Security Limitations

LLM Evasion (LLM-as-a-Judge Vulnerabilities): The Judge Agent is itself a Large Language Model running under resource and prompt constraints. While it successfully classifies the evaluation cases, it is susceptible to advanced prompt injection and evasion techniques:
- Obfuscation: Paraphrasing payload directives to avoid keyword detection.
- Encoding: Obfuscating payloads using Base64, hex, or custom ciphers.
- Fragmentation: Splitting the injection string across multiple messages to evade context windows.
- Direct Injection: Crafting overrides targeting the Judge's classification prompts directly rather than the downstream agents.
Simulated Exfiltration: The data exfiltration and "data exposed" keys shown in the dashboard (e.g., exposed API keys, employee PII, invoices) are simulated illustrative data-egress categories triggered for visualization purposes. No real business data or credentials are exfiltrated to external destinations.

References

Research Paper: Stav Cohen, Ron Bitton, and Ben Nassi. "Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications." arXiv preprint arXiv:2403.02817 (2024).

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
backend		backend
eval		eval
frontend		frontend
screenshots		screenshots
tests		tests
.env.example		.env.example
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
DEMO.md		DEMO.md
README.md		README.md
contagion.json		contagion.json
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CONTAGION: AI Agent Mesh Security Demonstration

Design

Key Features

Technical Stack

Installation and Execution

Simulation Scenarios

Scenario 1: Judge Block (Adversarial Payload + Security Gate)

Scenario 2: Pipeline Compromise (Adversarial Payload + Gate Bypassed)

Scenario 3: Clean Execution (Standard Input + Security Gate)

Architecture and Design Decisions

Limitations & Evaluation

Measured Performance

Key Security Limitations

References

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

CONTAGION: AI Agent Mesh Security Demonstration

Design

Key Features

Technical Stack

Installation and Execution

Simulation Scenarios

Scenario 1: Judge Block (Adversarial Payload + Security Gate)

Scenario 2: Pipeline Compromise (Adversarial Payload + Gate Bypassed)

Scenario 3: Clean Execution (Standard Input + Security Gate)

Architecture and Design Decisions

Limitations & Evaluation

Measured Performance

Key Security Limitations

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages