nvHive

One curl command turns a rented Linux GPU desktop — GeForce NOW, RunPod, Lambda, Vast — into a working AI lab. No root, no Docker, survives reconnects.


curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.sh | bash

That one command gives you:

  • A local multimodal LLM, auto-picked for your GPU's VRAM, served by Ollama — chat and image understanding with nothing leaving the machine
  • A web dashboard at localhost:3000 with an AI Wizard that knows your workspace and can fix it
  • A multi-LLM router across 23 providers (Ollama, Groq, Gemini, NVIDIA NIM, OpenAI, Anthropic, ...) — many with free tiers
  • A persistent storage layout under NVH_HOME, so models and chats survive when the cloud desktop resets

Renting the GPU by the hour? nvHive installs into user-owned paths on the persistent volume and verifies every service is healthy before it opens your browser. You pay for GPU time, not for debugging time.


What happens when you run it

The installer detects your GPU and persistent storage, then walks through a visible, skippable flow:

1. Model download countdown. You're told what's downloading, how big it is, and how to skip:

AI Wizard local brain: llama3.2-vision (~7.9 GB)
  This is the model the Wizard chats with. Smaller models load fast on CPU;
  bigger ones are stronger on GPU. You can change it later from the WebUI.
  Starting in 10s... press [s] to skip

Models are picked by VRAM: moondream (~1.7 GB, runs on CPU) → minicpm-v (12 GB+) → llama3.2-vision (16 GB+) → NVIDIA nemotron-3-nano-omni (24 GB+) → nemotron-omni (40 GB+). If a pull fails, the installer falls through to the next smaller model instead of dying. Headless installs can opt out with NVH_INSTALL_MODEL_DOWNLOAD=0.
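The tier selection and fall-through described above can be sketched as follows. This is a minimal illustration, not the installer's actual code: the thresholds come from the list above, while `TIERS`, `candidates`, `pull_with_fallback`, and the injected `pull` callable are hypothetical names, and the model names may not match the exact tags the installer pulls.

```python
from typing import Callable, Optional

# (model name, minimum VRAM in GB), largest first, as listed in the README.
# moondream has no VRAM requirement because it runs on CPU.
TIERS = [
    ("nemotron-omni", 40),
    ("nemotron-3-nano-omni", 24),
    ("llama3.2-vision", 16),
    ("minicpm-v", 12),
    ("moondream", 0),
]


def candidates(vram_gb: float) -> list[str]:
    """Models that fit in the given VRAM, largest first."""
    return [name for name, min_gb in TIERS if vram_gb >= min_gb]


def pull_with_fallback(vram_gb: float, pull: Callable[[str], bool]) -> Optional[str]:
    """Try each fitting model in turn; return the first that pulls, else None.

    `pull` is a hypothetical stand-in for whatever actually downloads a model.
    """
    for name in candidates(vram_gb):
        if pull(name):
            return name
    return None
```

With 16 GB of VRAM the sketch tries llama3.2-vision first, then minicpm-v, then moondream, which mirrors the "next smaller model" behaviour rather than aborting on the first failed pull.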

2. Verified bring-up. Services start in dependency order with real health gates, shown live:

                       nvHive bring-up
 Service                    Port    Status     Detail
 Local AI brain (Ollama)    11434   ✓ ready    /api/tags responding
 nvHive backend (API)       8000    ✓ ready    /v1/health ok
 Web dashboard (WebUI)      3000    ✓ ready    serving
 End-to-end test            —       ✓ ready    Wizard answered

The fourth row is a real smoke test: it POSTs a chat message to the Wizard and waits for an answer.
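A dependency-ordered health gate of this kind could look roughly like the sketch below. The ports and the `/api/tags` and `/v1/health` paths are taken from the table above; the WebUI path `/`, the retry counts, and the names `GATES`, `http_ok`, and `bring_up` are assumptions for illustration. The end-to-end Wizard check is left out because its endpoint is not given here.

```python
import time
import urllib.request
from typing import Callable

# Checked in dependency order. The WebUI path "/" is an assumption.
GATES = [
    ("Ollama", "http://localhost:11434/api/tags"),
    ("API", "http://localhost:8000/v1/health"),
    ("WebUI", "http://localhost:3000/"),
]


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """True if the URL answers with a 2xx status; False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


def bring_up(
    probe: Callable[[str], bool] = http_ok,
    retries: int = 30,
    delay: float = 1.0,
) -> tuple[bool, str]:
    """Poll each gate in order; stop at the first one that never turns healthy.

    Returns (True, "") on success, or (False, name_of_failing_gate).
    """
    for name, url in GATES:
        for _ in range(retries):
            if probe(url):
                break
            time.sleep(delay)
        else:
            return False, name
    return True, ""
```

Passing the probe in as a parameter keeps the ordering logic testable without any services running.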

3. Browser opens only on green. Your first sight of the dashboard is a working dashboard — never a red "API offline" banner. If anything fails, you get the failing step, the log path, and the last 25 lines of that log inline.
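The "failing step, log path, last 25 lines" report can be approximated in a few lines. This is a sketch under stated assumptions: `tail` and `failure_report` are hypothetical helper names, and the message layout is invented for illustration rather than copied from the installer.

```python
from collections import deque
from pathlib import Path


def tail(path: Path, n: int = 25) -> list[str]:
    """Return the last n lines of a text file, keeping at most n lines in memory."""
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


def failure_report(step: str, log_path: Path, n: int = 25) -> str:
    """Format the failing step, the log path, and the log tail as one message."""
    lines = tail(log_path, n) if log_path.exists() else ["(log file not found)"]
    header = [f"FAILED: {step}", f"Log: {log_path}", "--- log tail ---"]
    return "\n".join(header + lines)
```

For example, `failure_report("nvHive backend (API)", Path(home) / "logs" / "api-server.log")` would produce a message that can be printed inline, where `home` is the value of NVH_HOME.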


What you can do with it

Local AI, immediately

The dashboard's Wizard chat runs against your local model first — $0, private, offline-capable. The CLI works the same way:

nvh "summarize this error log"      # routes to the best available model
nvh safe "review this contract"     # local only — nothing leaves the machine

AI Wizard

A streaming, tool-using assistant that reads live workspace state. It can refresh models, repair the workspace, run RAG over files you drag into the chat (PDFs included), and search the web, citing sources and showing cost and latency per response. Slash commands in chat: /help, /save, /pin, /clear, /tools.

Multi-provider routing

One interface over 23 providers and 63 models. Requests are scored on capability, cost, latency, and provider health, then routed: free tiers first when you have no API keys, your local GPU first when you have one. Add keys with nvh setup. See the Provider guide for details.
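To make the scoring idea concrete, here is one way such a router might combine the four signals. The weights, the normalisation, and the names `Candidate`, `WEIGHTS`, `score`, and `route` are all hypothetical; the README names the inputs but does not give the actual formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    capability: float   # 0..1, higher is better
    cost_per_1k: float  # USD per 1k tokens, lower is better
    latency_ms: float   # lower is better
    healthy: bool       # provider health gate


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"capability": 0.5, "cost": 0.3, "latency": 0.2}


def score(c: Candidate, max_cost: float, max_latency: float) -> float:
    """Weighted sum where cost and latency are normalised so that lower scores higher."""
    cost_term = 1.0 - (c.cost_per_1k / max_cost if max_cost else 0.0)
    latency_term = 1.0 - (c.latency_ms / max_latency if max_latency else 0.0)
    return (
        WEIGHTS["capability"] * c.capability
        + WEIGHTS["cost"] * cost_term
        + WEIGHTS["latency"] * latency_term
    )


def route(cands: list[Candidate]) -> Optional[Candidate]:
    """Pick the best-scoring healthy candidate, or None if none is healthy."""
    healthy = [c for c in cands if c.healthy]
    if not healthy:
        return None
    max_cost = max(c.cost_per_1k for c in healthy)
    max_latency = max(c.latency_ms for c in healthy)
    return max(healthy, key=lambda c: score(c, max_cost, max_latency))
```

Treating health as a hard filter instead of a weighted term means an unhealthy provider can never win on capability alone, which is one plausible reading of "provider health" in the list of signals.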

Agents and council mode

Six built-in agent profiles (Wizard, Coder, Researcher, Writer, Ops, Vault-RAG), each mappable to a local or cloud model; your own profiles live in $NVH_HOME/agent-profiles/. Council mode runs one question through multiple models in parallel and synthesizes the answers:

nvh convene "Redis or Postgres for session storage?"   # multi-model deliberation
nvh agent "add unit tests for auth" --dir ./myproject  # agentic coding with QA

Council docs · Agent tools
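The council pattern (fan one question out to several models, then merge the answers) can be sketched with the standard library alone. The `convene` function, the `Model` alias, and the `synthesize` callable below are hypothetical stand-ins; how nvHive actually calls models and combines their answers is covered in the Council docs, not here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A "model" here is any callable that maps a question to an answer.
Model = Callable[[str], str]


def convene(
    question: str,
    members: dict[str, Model],
    synthesize: Callable[[str, dict[str, str]], str],
) -> str:
    """Ask every member in parallel, then hand all answers to a synthesizer."""
    with ThreadPoolExecutor(max_workers=max(1, len(members))) as pool:
        futures = {name: pool.submit(ask, question) for name, ask in members.items()}
        answers = {name: fut.result() for name, fut in futures.items()}
    return synthesize(question, answers)
```

Threads suit this shape because each member call is expected to spend most of its time waiting on a local or remote model rather than computing in Python.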

Creative and studio packs

Rootless one-command installs for ComfyUI, Blender, game-dev tooling, and music production (stem splitting, transcription, generation):

nvh studio --list
nvh studio --install comfy -y
nvh studio --install creative -y
nvh studio --install music -y

Built for machines that disappear

Cloud GPU desktops reset. nvHive plans for it:

  • Everything that matters lives under NVH_HOME on the persistent volume — models, config, chats, vault, logs, jobs
  • Long downloads run as resumable jobs that survive browser refreshes and reconnects
  • /pin a conversation and a Welcome Back panel resumes it on the next session
  • nvh snapshot save tarballs your state; nvh snapshot restore resumes it on a brand-new VM
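The snapshot idea from the last bullet, packing the state directory into a tarball and unpacking it on a fresh machine, can be illustrated as below. This is a sketch, not the `nvh snapshot` implementation: `snapshot_save` and `snapshot_restore` are hypothetical names, gzip compression is an assumption, and the archive is expected to live outside the directory being saved.

```python
import tarfile
from pathlib import Path


def snapshot_save(home: Path, archive: Path) -> None:
    """Pack everything under `home` into a gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        for child in sorted(home.iterdir()):
            # Store paths relative to `home` so the archive restores anywhere.
            tar.add(child, arcname=child.name)


def snapshot_restore(archive: Path, home: Path) -> None:
    """Unpack a snapshot into `home`, creating the directory if needed."""
    home.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(home, **kwargs)
```

Storing relative paths is what lets a snapshot taken under one mount point be restored under a different NVH_HOME on the new VM.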

If your persistent mount isn't auto-detected, set it before installing:

export NVH_HOME=/mnt/persist/nvhive

Requirements

  • Linux x86_64 (the primary target; Windows and macOS installers exist — see Releases)
  • No root. Everything installs to user-owned paths. No Docker required.
  • Python 3.11+ — or none at all: the installer can fetch a single-file binary (NVH_USE_BINARY=1)
  • GPU optional. CPU-only machines get moondream locally plus cloud free tiers. An NVIDIA GPU unlocks the larger local models.
  • Disk: ~2 GB minimum for the smallest local model; up to ~35 GB for the largest tier. The installer checks free space and tells you sizes before downloading.
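A free-space check like the one in the last bullet is a small wrapper around `shutil.disk_usage`. The sketch below is illustrative only: `has_room`, the 1 GB headroom default, and the use of decimal gigabytes are assumptions, not details taken from the installer.

```python
import shutil
from pathlib import Path

GB = 1000 ** 3  # decimal gigabytes, matching how download sizes are usually quoted


def has_room(path: Path, needed_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the filesystem holding `path` has space for the download plus headroom."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= (needed_gb + headroom_gb) * GB
```

Checking against the volume that holds NVH_HOME, for example `has_room(Path(home), 7.9)` for the ~7.9 GB llama3.2-vision download, matters more than checking the root filesystem, since the models land on the persistent mount.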

Already have a Python environment? pip install nvhive (extras: [vision], [browser], [rag], [all]).


When something breaks

Three places to look, in order:

nvh services status        # per-service health table
nvh services smoke-test    # "can the Wizard actually answer?" end-to-end check
nvh doctor                 # full diagnostic

In the dashboard, the Debug Report button generates a redacted report (secrets and local paths stripped) you can paste into an issue. Logs live under $NVH_HOME/logs/ (ollama.log, api-server.log, model-pull.log). nvh services restart recycles the stack; nvh repair runs safe rootless fixes.
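Redaction of the kind the Debug Report performs can be approximated with a regular expression plus a path substitution. The pattern, the `[REDACTED]` placeholder, and the `redact` function below are hypothetical; the real report may strip more, or differently.

```python
import re

# Hypothetical pattern: any KEY=value or KEY: value pair whose key name
# mentions an API key, token, secret, or password.
SECRET = re.compile(
    r"(?i)([\w-]*(?:api[_-]?key|token|secret|password)[\w-]*)(\s*[=:]\s*)(\S+)"
)


def redact(text: str, home: str) -> str:
    """Mask secret-looking values and replace the user's home path with '~'."""
    text = SECRET.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return text.replace(home, "~")
```

A pattern list like this is easy to extend, but it only catches secrets that follow a recognisable key name, so a redacted report is still worth a quick read before it goes into a public issue.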


Command reference

Command                           What it does
nvh "question"                    Route to the best available model
nvh safe "question"               Local inference only
nvh convene "question"            Multi-model council with synthesis
nvh agent "task"                  Agentic coding with review loop
nvh webui                         Open the dashboard
nvh services start                Verified bring-up (Ollama → API → WebUI → smoke test)
nvh services stop                 Stop the stack (keeps Ollama's warm model cache)
nvh studio --install <pack> -y    Install a rootless tool pack
nvh snapshot save / restore       Persist state across ephemeral VMs
nvh setup                         Configure providers and keys

Full reference: docs/COMMANDS.md

Documentation

Guide                 What's inside
Linux GPU Desktop     The no-root cloud workstation path in depth
GPU Tier Matrix       Which capabilities unlock at which VRAM
Providers             All 23 providers, free tiers, rate limits
Council               Multi-LLM deliberation design
Architecture          Routing, layers, system design
SDK & API             Python SDK, REST API, OpenAI/Anthropic-compatible proxies
Configuration         Every knob, including NVH_HOME and install env vars

Notes

  • Cloud providers receive the queries you route to them, under their own privacy policies. Use nvh safe to keep inference local.
  • AI output can be wrong. Review agent-modified files before shipping them.

License

MIT — see LICENSE and NOTICE. The MIT license does not grant rights to the nvHive name, logos, or publishing identities; forks should use distinct names and channels. See TRADEMARKS.
