AURA

Your computer already has an AI. It just doesn't know it yet.

Not another chatbot. AURA is a system-level AI that lives on your machine, executes real actions, and never phones home.

No cloud. No API keys. No subscriptions. No data leaving your machine. Ever.

Get Started · What It Can Do · Architecture · Roadmap · Contribute

The Problem

Every "AI assistant" today is a chat window connected to someone else's server.

You type. It responds. That's it.

You can't tell it to create a file on your desktop. You can't ask it to kill a runaway process. You can't say "open Chrome" and have it happen. You can't speak a command and hear the answer.

ChatGPT can't touch your filesystem. Copilot can't monitor your CPU. AutoGPT burns through API credits and still can't move a file.

AURA doesn't chat about doing things. It does them.

What Makes AURA Different

| Feature | ChatGPT / Copilot | AutoGPT / AgentGPT | AURA |
|---|---|---|---|
| Runs locally | Cloud-only | Needs API keys | Fully offline with Ollama |
| Executes system actions | Chat only | Unreliable | File, process, shell, voice |
| Voice interface | No | No | Whisper STT + TTS pipeline |
| Privacy | Data sent to servers | Data sent to servers | Nothing leaves your machine |
| Security model | N/A | None | Sandboxed, audited, policy-enforced |
| Cost | $20/mo+ | API credits | Free forever |
| Works offline | No | No | 100% offline capable |

What AURA Can Do

Phase 1 — System Control (CLI)

> create file desktop/notes.txt
File created: C:\Users\You\Desktop\notes.txt

> cpu
CPU: 23.4%

> kill process chrome
Process 'chrome' terminated.

> create project desktop/my-app
Project 'my-app' created with src/ tests/ README.md .gitignore requirements.txt

> run command git status

Phase 2 — Voice + Intelligence (Current)

AURA now hears you, thinks locally, speaks back, and executes real actions — powered entirely by local models.

"Hey Jarvis, create a folder named project on desktop"
  → Wake word + command extracted in one step
  → Intent: SYSTEM_COMMAND (regex, 0ms)
  → Folder created instantly
  → TTS: "Folder project created on Desktop."

"Hey Jarvis, what is Python?"
  → Intent: GENERAL_KNOWLEDGE (regex, 0ms)
  → Streams response from llama3.2:1b
  → TTS speaks first sentence in ~3s

"Hey Jarvis, open Chrome"
  → Intent: SYSTEM_COMMAND
  → Chrome opens immediately
  → TTS: "Opening Chrome."

[CTRL+SPACE] → "Write a Python function to sort a list"
  → Intent: CODE_GENERATION
  → Streams from deepseek-coder:6.7b

Voice pipeline flow:

Wake ("Hey Jarvis") → Command Extraction → Regex Intent (0ms) → Execute / Stream LLM → TTS

System commands execute directly — no LLM round-trip:

| Voice Command | What Happens |
|---|---|
| "Create a folder named X on desktop" | Creates the folder instantly |
| "Delete file X from documents" | Deletes the file |
| "Open Chrome / Notepad / any app" | Launches the application |
| "Kill process chrome" | Terminates the process |
| "CPU" / "RAM" | Speaks current system usage |

7 intent types classified instantly via regex (no LLM call):

| Intent | Routed To | Example |
|---|---|---|
| `SYSTEM_COMMAND` | Direct execution | "Create folder", "Open Chrome", "Kill process" |
| `CODE_GENERATION` | deepseek-coder:6.7b | "Write a REST endpoint in FastAPI" |
| `GENERAL_KNOWLEDGE` | llama3.2:1b (streamed) | "Explain Docker networking" |
| `DEV_TASK` | llama3.2:1b | "Push my code to GitHub" |
| `VISION_TASK` | llava:7b | "What's on my screen?" |
| `PROJECT_CONTEXT` | llama3.2:1b | "What routes does my project have?" |
| `REALTIME_QUERY` | llama3.2:1b | "What's the latest Node.js version?" |
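To make the regex routing concrete, here is a minimal sketch of an ordered pattern table where the first match wins. The patterns, the `classify` name, and the fallback intent are assumptions for illustration, not the contents of intent_router.py, and only four of the seven intents are covered.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
INTENT_PATTERNS = [
    ("SYSTEM_COMMAND", re.compile(r"^(create|delete|open|launch|kill)\b|^(cpu|ram)$", re.I)),
    ("CODE_GENERATION", re.compile(r"^write (a|an|some)\b.*\b(function|class|script|endpoint)\b", re.I)),
    ("VISION_TASK", re.compile(r"\bon my screen\b", re.I)),
    ("GENERAL_KNOWLEDGE", re.compile(r"^(what is|explain|who is|how does)\b", re.I)),
]


def classify(text: str) -> str:
    """Return the first intent whose pattern matches the utterance."""
    cleaned = text.strip().rstrip("?.!")
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(cleaned):
            return intent
    # Assumed default when nothing matches: hand the text to the general model.
    return "GENERAL_KNOWLEDGE"
```

Because no model is called, classification costs only a handful of regex matches. The trade-off is that phrasing outside the patterns falls through to the default.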

Architecture

┌─────────────────────────────────────────────────────┐
│                    INPUT LAYER                       │
│    CLI · "Hey Jarvis" (Whisper) · CTRL+SPACE        │
├─────────────────────────────────────────────────────┤
│               VOICE PIPELINE (Phase 2)              │
│  VAD → Wake Word → Whisper STT → Intent Router     │
├─────────────────────────────────────────────────────┤
│                 REASONING LAYER                      │
│   OllamaClient (6 local models) · Intent Classifier │
├─────────────────────────────────────────────────────┤
│                 SECURITY LAYER                       │
│    Sandbox · Policy · Permissions · Audit Chain      │
├─────────────────────────────────────────────────────┤
│                EXECUTION LAYER                       │
│   Isolated Worker Process · Plugin Registry · IPC    │
├─────────────────────────────────────────────────────┤
│                  PLUGIN LAYER                        │
│  System · Git · Docker · Browser · Gmail · Spotify   │
│  Vision · Weather · Calendar · Memory                │
├─────────────────────────────────────────────────────┤
│                  OUTPUT LAYER                        │
│       Console · TTS (Edge/Piper/pyttsx3) · EventBus │
└─────────────────────────────────────────────────────┘

Wake word detection (three-tier fallback):

| Tier | Engine | How it works |
|---|---|---|
| 1 (default) | Whisper keyword spotting | VAD detects speech → records 1.5s → Whisper transcribes → matches "Hey Jarvis" + extracts command |
| 2 | openwakeword | Lightweight ONNX model (auto-fallback if Whisper unavailable) |
| 3 | CTRL+SPACE | Keyboard hotkey, always works alongside any voice tier |
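The tier 1 step of matching "Hey Jarvis" and extracting the command from the same transcript could look roughly like this. It is a sketch that assumes the transcript is already plain text from Whisper; `WAKE_PATTERN` and `extract_command` are hypothetical names, not the wake_word.py API.

```python
import re
from typing import Optional

# Hypothetical wake phrase pattern: tolerates "Hey, Jarvis." and similar punctuation.
WAKE_PATTERN = re.compile(r"^\s*hey[\s,]+jarvis\b[\s,.!:]*", re.IGNORECASE)


def extract_command(transcript: str) -> Optional[str]:
    """Return the text after the wake phrase, or None if the phrase is absent.

    An empty string means the wake phrase was heard with no command attached.
    """
    match = WAKE_PATTERN.match(transcript)
    if match is None:
        return None
    return transcript[match.end():].strip()
```

Returning `None` and `""` separately lets a caller tell "no wake word" apart from "wake word only, keep listening".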

Performance optimizations:

Single-shot wake + command — "Hey Jarvis, what is Python?" is captured in one recording, no second prompt
Regex-only intent classification — 0ms classification, no LLM round-trip
Streaming LLM responses — TTS speaks the first sentence while the model is still generating
Model pre-warming — primary model is loaded into RAM at startup for instant inference
System commands bypass LLM entirely — file/folder/app operations execute directly
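The streaming item above depends on cutting the model's output into sentences as chunks arrive, so TTS can start before generation finishes. Here is a minimal sketch of that buffering, assuming the LLM client yields plain text chunks; `sentences_from_stream` is a hypothetical helper, not part of AURA's code.

```python
import re
from typing import Iterable, Iterator

# Split on whitespace that follows sentence-ending punctuation.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence; the last may be partial.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence can go straight to the TTS engine. This simple splitter also breaks on abbreviations such as "e.g. ", so a real pipeline would probably need a more careful boundary rule.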

Key design decisions:

The main process never imports plugin code — plugins run in isolated worker subprocesses over JSON IPC
EventBus connects all modules via 18 typed events — no direct coupling
ModeMonitor detects online/offline and switches TTS engines automatically
TTS failover chain: Edge TTS (online) → Piper (offline) → pyttsx3 (fallback)
Wake word shares the Whisper model with STT — zero additional memory cost
All config is centralized in config.yaml — no hardcoded values in source
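The first decision above, plugins running in a worker subprocess and talking JSON over IPC, can be sketched as follows. The message shape (`action`, `args`, `ok`, `result`) and the inline worker are assumptions for illustration; AURA's actual protocol and worker lifecycle may differ.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads one JSON request per line, writes one JSON reply per line.
WORKER_SOURCE = """
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    if request.get("action") == "echo":
        reply = {"ok": True, "result": request["args"]["text"]}
    else:
        reply = {"ok": False, "error": "unknown action"}
    print(json.dumps(reply), flush=True)
"""


def call_worker(action: str, args: dict, timeout: float = 10.0) -> dict:
    """Send one request to a fresh worker process and return its decoded reply."""
    request = json.dumps({"action": action, "args": args}) + "\n"
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],  # argv list, no shell involved
        input=request,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(completed.stdout.splitlines()[0])
```

The parent only ever sees JSON, so a crash or a bad import inside plugin code stays in the worker. This sketch spawns a process per call for simplicity, where a long-lived worker would avoid the startup cost.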

Getting Started

Prerequisites

Python 3.10+
Ollama installed and running (ollama.com)

Install

git clone https://github.com/aryanjsx/AURA.git
cd AURA
pip install -r requirements.txt

Pull the Ollama models

ollama pull llama3.2:1b            # Primary (fast voice responses)
ollama pull llama3.2:3b            # Reasoning fallback
ollama pull deepseek-coder:6.7b    # Code generation
ollama pull llava:7b               # Vision (Phase 4)
ollama pull nomic-embed-text:latest # Embeddings (Phase 6)
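
To confirm the pulls succeeded, a short script can ask the local Ollama server which models it has. This sketch assumes Ollama is listening on its default address, http://localhost:11434, and uses its `/api/tags` endpoint; the helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

REQUIRED_MODELS = [
    "llama3.2:1b",
    "llama3.2:3b",
    "deepseek-coder:6.7b",
    "llava:7b",
    "nomic-embed-text:latest",
]


def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return model names reported by Ollama, or [] if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return []
    return [model["name"] for model in payload.get("models", [])]


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required models that are not installed, in the original order."""
    have = set(installed)
    return [name for name in required if name not in have]
```

Calling `missing_models(REQUIRED_MODELS, installed_models())` returns the names still to pull. An empty `installed_models()` result can also mean Ollama is not running.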

If your models are stored in a custom location (e.g., D:\ollama\models), set the OLLAMA_MODELS environment variable before starting Ollama. In PowerShell:

$env:OLLAMA_MODELS="D:\ollama\models"

Run

# Phase 2 — Voice pipeline (full experience)
python main.py

# Phase 1 — CLI mode (text commands only)
python -m aura
python -m aura --yes "cpu"

Say "Hey Jarvis" to activate voice input, or press CTRL+SPACE as a manual fallback. Speak your command and AURA responds.

Quick Reference

Voice commands (say "Hey Jarvis" then speak naturally):

| Category | Voice Examples |
|---|---|
| Files | "Create a folder named project on desktop", "Delete file notes.txt from documents" |
| Apps | "Open Chrome", "Open Notepad", "Launch VS Code" |
| System | "CPU", "RAM", "Kill process chrome" |
| Questions | "What is Python?", "Explain Docker networking" |
| Code | "Write a function to sort a list" |

CLI commands (via python -m aura):

| Category | Commands |
|---|---|
| Files | `create file`, `delete file`, `rename file`, `move file`, `search files` |
| System | `cpu`, `ram`, `list processes`, `check system health`, `kill process` |
| Projects | `create project <path>` |
| Shell | `run command <cmd>` (allowlisted: git, npm, docker) |
| npm | `npm install [path]`, `npm run <script>` |
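
The allowlist on `run command` can be pictured as a check on the first token of the command before anything executes. This is a sketch of the idea only; `parse_allowlisted`, `run_command`, and `CommandNotAllowed` are hypothetical names, and AURA's real policy layer presumably does more.

```python
import shlex
import subprocess

ALLOWED_PROGRAMS = {"git", "npm", "docker"}


class CommandNotAllowed(Exception):
    """Raised when the requested program is not on the allowlist."""


def parse_allowlisted(command: str) -> list[str]:
    """Split the command into argv and reject it unless argv[0] is allowlisted."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise CommandNotAllowed(command)
    return argv


def run_command(command: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run an allowlisted command as an argv list, never through a shell."""
    argv = parse_allowlisted(command)
    # Without a shell, characters like ';' or '|' are passed as plain arguments.
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```

Note that `shlex.split` follows POSIX quoting rules, so Windows paths with backslashes would need quoting or a different splitter.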

Project Structure

AURA/
├── aura/
│   ├── core/
│   │   ├── config_loader.py    # YAML config with strict validation
│   │   ├── ollama_client.py    # Ollama API client with streaming
│   │   ├── intent_router.py    # Regex-based intent classification
│   │   ├── voice_executor.py   # Direct system command execution
│   │   ├── errors.py           # Custom exception hierarchy
│   │   └── ...
│   ├── modules/
│   │   ├── stt.py              # Whisper speech-to-text engine
│   │   ├── tts.py              # Multi-engine text-to-speech
│   │   └── wake_word.py        # Whisper-based wake word + CTRL+SPACE
│   ├── utils/
│   │   ├── audio_input.py      # Microphone device resolution
│   │   ├── event_bus.py        # Singleton pub/sub with 18 event types
│   │   └── mode_monitor.py     # Online/offline detection daemon
│   ├── security/               # Sandbox, audit, policy enforcement
│   └── runtime/                # Execution engine, planner, worker IPC
├── plugins/
│   ├── system/                 # File, process, shell operations
│   ├── git/                    # Git automation
│   ├── docker/                 # Docker lifecycle management
│   ├── browser/                # Web automation (Playwright)
│   ├── vision/                 # Screen capture + LLaVA
│   ├── gmail/                  # Email integration
│   ├── spotify/                # Music control
│   ├── calendar/               # Calendar events
│   ├── weather/                # Weather queries
│   └── memory/                 # ChromaDB semantic memory
├── tests/
│   ├── test_phase2_audit_part1.py  # EventBus, ModeMonitor, Ollama, Router
│   ├── test_phase2_audit_part2.py  # STT, WakeWord, TTS, Config, Safety
│   └── fixtures/               # Test audio files, bad config
├── scripts/                    # Diagnostic and integration test scripts
├── config.yaml                 # Central configuration
├── main.py                     # Phase 2 voice pipeline entry point
└── requirements.txt

Test Suite

The adversarial audit suite covers every module with both happy-path and edge-case tests:

| Section | Tests | Status |
|---|---|---|
| EventBus (happy + adversarial) | 14 | All pass |
| ModeMonitor (happy + adversarial) | 7 | All pass |
| OllamaClient (happy + adversarial) | 8 | All pass |
| IntentRouter + IntentObject | 13 | All pass |
| STTEngine (happy + adversarial) | 13 | All pass |
| WakeWordListener (happy + adversarial) | 11 | All pass |
| TTSEngine (happy + adversarial) | 9 | All pass |
| Config validation | 2 | All pass |
| Safety (static analysis) | 5 | All pass |
| Pipeline E2E | 1 | All pass |
| Regression guards | 5 | All pass |

Security verified: No shell=True, no eval/exec, no subprocess string injection, no audio persisted to disk, all layer boundaries enforced.
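
A static check of the kind described above can be written with the standard `ast` module. The sketch below flags direct `eval`/`exec` calls and any call passing `shell=True`. It illustrates the approach and is not the project's actual safety test; `find_violations` is a hypothetical name.

```python
import ast

FORBIDDEN_CALLS = {"eval", "exec"}


def find_violations(source: str) -> list[str]:
    """Return a description of each forbidden construct found in the source text."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Direct calls such as eval("...") or exec("...").
        if isinstance(node.func, ast.Name) and node.func.id in FORBIDDEN_CALLS:
            violations.append(f"line {node.lineno}: call to {node.func.id}()")
        # Any call that passes the literal keyword argument shell=True.
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                violations.append(f"line {node.lineno}: shell=True")
    return violations
```

A test could read every .py file in the package and assert that the result is empty. Indirect uses, such as `getattr(builtins, "eval")` or a variable holding `True`, would slip past this simple scan.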

Roadmap

| Phase | What Ships | Status |
|---|---|---|
| Phase 0 — Core Infrastructure | Event bus, config, registry, CLI, execution backbone | Done |
| Phase 1 — System Plugin | File/process/npm operations, sandbox, permissions, audit chain | Done |
| Phase 2 — Voice + Intelligence | Whisper STT, Ollama LLM routing, TTS, intent classification | Done |
| Phase 3 — Dev Tools | Git automation, Docker lifecycle, browser automation | Next |
| Phase 4 — Vision | Screen capture, OCR, visual reasoning with LLaVA | Planned |
| Phase 5 — GUI Dashboard | PyQt6 desktop interface with live command log | Planned |
| Phase 6 — Memory + RAG | ChromaDB semantic memory, conversation history | Planned |
| Phase 7 — Browser Automation | Sandboxed web research with Playwright | Planned |
| Phase 8 — Integrations | Spotify, Weather, Calendar, Gmail bridges | Planned |

Philosophy

"If it needs the internet to think, it's not your AI."

Local-first — No cloud dependency. No API keys. Works on airplane mode.
Actions over answers — AURA doesn't explain how to create a file. It creates the file.
Security is non-negotiable — Sandboxed execution, tamper-evident audit logs, hash-chained integrity.
Modular by design — Every capability is a plugin. Add what you need. Remove what you don't.
Developer-owned — Open source. No telemetry. No tracking. Your machine, your rules.
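
For readers new to hash-chained logs, here is a minimal sketch of why they are tamper-evident: each entry's hash covers the previous entry's hash, so editing any earlier record breaks every later link. The entry layout and function names are assumptions for illustration, not AURA's audit format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; return False if any entry was altered or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits but does not prevent them. An attacker who can rewrite the whole log can also recompute every hash, so the latest hash usually has to be stored somewhere the attacker cannot reach.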

Contributing

We're building something big and we want you in.

1. Fork the repo
2. Create your branch (git checkout -b feat/amazing-feature)
3. Commit with Conventional Commits (feat(core): add amazing feature)
4. Push and open a Pull Request

See CONTRIBUTING.md for full guidelines. Check the open issues for the `good first issue` and `help wanted` labels.

Active areas where we need help:

Plugin development (Git, Docker, Browser, Gmail, Spotify)
Ollama prompt engineering for developer tasks
Cross-platform testing (macOS, Linux)
Test coverage expansion
GUI dashboard design (Phase 5)

Star This Repo

If AURA's vision resonates with you — an AI that runs locally, executes real actions, and respects your privacy — drop a star.

It takes one second and tells us you believe AI should be owned, not rented.

AURA — Autonomous Unified Response Architecture

Built offline. Powered locally. Yours completely.

GitHub · Issues · Contributing · Roadmap

MIT License — Built by @aryanjsx
