Forge Assistant

⚠️ Status: Under active development — not yet production-ready. The AI assistant is shipped as a preview to gather early feedback. APIs, models, default prompts, and capabilities may change between releases. Do not depend on it for critical workflows. The path out of preview and the criteria for General Availability are tracked in the GA Roadmap; operational recovery is covered in Disaster Recovery.

AI-powered assistant for the Forge infrastructure automation platform. Uses a local Ollama LLM with RAG (Retrieval-Augmented Generation) to provide contextual help, error analysis, and documentation search.

Overview

Forge Assistant is an optional, standalone service that can be plugged into or removed from any Forge deployment. It runs as a single all-in-one container with Ollama (LLM) and ChromaDB (embedded) bundled inside.

┌──────────────────┐     ┌──────────────────────────────────────┐
│  Forge Frontend  │────▶│         Forge Assistant               │
│  (React chat)    │ SSE │  ┌──────────┐  ┌──────────────────┐  │
└──────────────────┘     │  │  Ollama   │  │    FastAPI        │  │
                         │  │ gemma3:1b │  │  (RAG pipeline)   │  │
                         │  └──────────┘  └────────┬──────────┘  │
                         │                ┌────────▼──────────┐  │
                         │                │  ChromaDB (embed)  │  │
                         │                └───────────────────┘  │
                         └──────────────────────────────────────┘
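
To make the retrieval step in the diagram more concrete, here is a minimal sketch of top-k retrieval followed by prompt assembly. It is not the assistant's actual code: the function names, the toy three-dimensional vectors, and the prompt layout are assumptions for illustration. In the real service, embeddings would come from the embedding model and similarity search would be handled by ChromaDB.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec: list[float],
                   chunks: list[tuple[str, list[float]]],
                   k: int = 5) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector.

    `chunks` is a list of (text, embedding) pairs (hypothetical in-memory store).
    """
    ranked = sorted(chunks,
                    key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Join retrieved chunks into a context block placed ahead of the question."""
    context = "\n---\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The default `k=5` mirrors the documented default of `FORGE_ASSISTANT_RAG_TOP_K`; everything else in the sketch is illustrative.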

Features

Contextual help — knows which page the user is on
Documentation search — RAG-powered answers from indexed Forge/Ansible docs
Error explanation — analyze failed job output
Streaming responses — token-by-token display via Server-Sent Events
Privacy-first — all data stays on your server, no cloud APIs

Quick Start

# Start the assistant (all-in-one: Ollama + ChromaDB + FastAPI)
docker compose up -d

# Wait ~2 minutes for Ollama to load the model on first start,
# then index documentation
curl -X POST http://localhost:8100/api/v1/index

# Test it
curl -X POST http://localhost:8100/api/v1/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "How do I create a job template?"}'

Note: On first start, the entrypoint automatically pulls the LLM model (gemma3:1b) and embedding model (nomic-embed-text). The healthcheck start_period is 120 seconds to allow time for this.

Integration with Forge

To add the assistant to an existing Forge deployment:

cd /opt/forge
docker compose -f docker-compose.yml -f path/to/forge-assistant/docker-compose.integration.yml up -d

The frontend automatically detects the assistant via health check and shows the chat button.

Configuration

All settings are read from environment variables with the `FORGE_ASSISTANT_` prefix:

| Variable | Default | Description |
|---|---|---|
| `FORGE_ASSISTANT_OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API URL (localhost: runs inside the same container) |
| `FORGE_ASSISTANT_OLLAMA_MODEL` | `gemma3:1b` | LLM model |
| `FORGE_ASSISTANT_OLLAMA_EMBED_MODEL` | `nomic-embed-text` | Embedding model |
| `FORGE_ASSISTANT_CHROMA_HOST` | `localhost` | ChromaDB host (localhost: embedded in the same container) |
| `FORGE_ASSISTANT_CHROMA_PORT` | `8000` | ChromaDB port |
| `FORGE_ASSISTANT_RAG_TOP_K` | `5` | Number of docs to retrieve |
| `FORGE_ASSISTANT_LOG_LEVEL` | `INFO` | Logging level |
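
As an illustration of how prefixed environment variables can map onto typed settings, the sketch below loads the variables listed above into a dataclass, falling back to the documented defaults. The `AssistantSettings` class and `load_settings` function are hypothetical names; the service itself may load its configuration differently.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping

PREFIX = "FORGE_ASSISTANT_"


@dataclass(frozen=True)
class AssistantSettings:
    """Defaults copied from the configuration table; the class itself is illustrative."""
    ollama_base_url: str = "http://localhost:11434"
    ollama_model: str = "gemma3:1b"
    ollama_embed_model: str = "nomic-embed-text"
    chroma_host: str = "localhost"
    chroma_port: int = 8000
    rag_top_k: int = 5
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> AssistantSettings:
    """Build settings from variables named PREFIX + FIELD_NAME in upper case."""
    overrides = {}
    for field in fields(AssistantSettings):
        raw = env.get(PREFIX + field.name.upper())
        if raw is None:
            continue  # keep the default
        # Integer-valued settings are detected from the type of their default.
        overrides[field.name] = int(raw) if isinstance(field.default, int) else raw
    return AssistantSettings(**overrides)
```

For example, `load_settings({"FORGE_ASSISTANT_RAG_TOP_K": "8"})` yields settings with `rag_top_k == 8` and every other field at its default.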

Hardware Requirements

| Setup | RAM | GPU | Response Time |
|---|---|---|---|
| CPU-only (phi3:mini) | 8 GB | None | 10-20 s |
| GPU (mistral:7b) | 16 GB | 8 GB VRAM | 2-5 s |
| GPU (llama3.1:8b) | 32 GB | 12 GB VRAM | 1-3 s |

Development

# Install dependencies
python3.12 -m venv .venv && source .venv/bin/activate
pip install -r requirements-dev.txt

# Run tests
pytest tests/ -v

# Lint
ruff check app/ tests/

# Run dev server
uvicorn app.main:app --reload --port 8100

API

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/health` | GET | Health check (Ollama + ChromaDB status) |
| `/api/v1/chat` | POST | Chat with SSE streaming |
| `/api/v1/index` | POST | Trigger document re-indexing |
| `/api/v1/docs` | GET | OpenAPI documentation |
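
Because `/api/v1/chat` streams its answer as Server-Sent Events, a client has to read the response incrementally instead of waiting for a single JSON body. The sketch below parses generic SSE `data:` lines and makes no assumption about what each payload contains (plain text, JSON, or an end marker), since the event schema is not described here. The function names are hypothetical; the request body matches the Quick Start example.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional leading space after the colon is not part of the data.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if buffer:
        # Lenient: flush a final event even if the stream ended without a blank line.
        yield "\n".join(buffer)


def stream_chat(message: str, base_url: str = "http://localhost:8100") -> Iterator[str]:
    """POST a message to the chat endpoint and yield each event's data as it arrives."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/chat",
        data=json.dumps({"message": message}).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_sse_data(line.decode("utf-8") for line in response)
```

With a running assistant, `for chunk in stream_chat("How do I create a job template?"): print(chunk)` prints each event payload as it arrives. How the payloads should be decoded or joined depends on the actual event format, which should be checked in the API Reference.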

Documentation

Architecture
API Reference
Configuration
Deployment
GA Roadmap — preview → GA exit criteria and milestones
Disaster Recovery — ChromaDB index backup, restore, and rebuild

License

Part of the Forge Platform.

