A cross-platform CLI tool for managing multiple llama.cpp versions on your machine.
# Install latest stable version
lvm install latest
# Switch to a specific version
lvm use b3412-cuda
# Interactive version picker
lvm use # arrow-key selection
lvm install # arrow-key selection
# List all installed versions
lvm ls

lvm is a lightweight version manager that simplifies working with multiple builds of llama.cpp. It handles:
- Installation of llama.cpp releases from GitHub
- Version switching between different builds (CPU, CUDA, Metal, Vulkan, etc.)
- Channel management between stable and beta releases
- Automatic shims for easy command invocation
- Cross-platform support (Windows, Linux, macOS)
Think of it like nvm (Node Version Manager) but for llama.cpp.
| Feature | Description |
|---|---|
| Multiple versions | Install and switch between any llama.cpp release |
| GPU backends | Support for CPU, CUDA, Metal, Vulkan, ROCm, OpenVINO, SYCL |
| Stable & Beta | Separate channels for production and bleeding-edge builds |
| Interactive picker | Arrow-key selection UI for use, install, and uninstall |
| One-command init | Automatic PATH configuration, zero manual setup |
| Auto-shims | All llama.cpp binaries become accessible via simple commands |
| Cross-platform | Works on Windows, Linux, and macOS |
| Cache | GitHub releases are cached for 6 hours to avoid repeated API calls |
| Refresh | Run lvm fetch to manually refresh cached data before the TTL expires |
| Clean uninstall | Remove versions without leaving artifacts |
# Linux/macOS
curl -sSL https://github.com/asertym/lvm/releases/latest/download/install.sh | sh
# Windows (PowerShell)
# Download from https://github.com/asertym/lvm/releases and run install.ps1

# Download the binary for your platform
wget https://github.com/asertym/lvm/releases/latest/download/lvm-linux-amd64
# Move to a location in your PATH
sudo mv lvm-linux-amd64 /usr/local/bin/lvm
# Initialize
lvm init

git clone https://github.com/asertym/lvm.git
cd lvm
go build -o lvm .
sudo mv lvm /usr/local/bin/lvm
lvm init

# 1. Initialize (run once)
lvm init
# 2. Install a version (interactive picker by default)
lvm install
# 3. Start using llama.cpp commands
llama-cli --help
llama-quantize model.gguf model-q4_0.gguf Q4_0

# Latest stable release
lvm install latest
# Latest beta/pre-release
lvm install latest-beta
# Specific build number
lvm install b3412
# With explicit GPU backend
lvm install latest --backend cuda
lvm install b3412 --backend vulkan
lvm install latest --backend metal
# Interactive picker (default when no version given)
lvm install # picks from releases list
lvm install -i # same, explicit flag
lvm install latest # non-interactive

Available backends:
- cpu: CPU-only build
- cuda: NVIDIA CUDA GPU acceleration
- metal: Apple Metal (macOS)
- vulkan: Vulkan API
- rocm: AMD ROCm
- openvino: Intel OpenVINO
- sycl-fp16: SYCL (primarily Intel GPUs)
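
Choosing a backend by hand can be scripted. The following is a minimal sketch, assuming that the presence of `nvidia-smi` or `vulkaninfo` on PATH (the same tools used for troubleshooting later in this README) is a good enough hint. `detect_backend` is a hypothetical helper, not an lvm command, and the probe order is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: guess a --backend value from tools found on PATH.
# Not part of lvm; the probe order (CUDA, Metal, Vulkan, CPU) is an assumption.
detect_backend() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo cuda
    elif [ "$(uname -s 2>/dev/null)" = "Darwin" ]; then
        echo metal
    elif command -v vulkaninfo >/dev/null 2>&1; then
        echo vulkan
    else
        echo cpu
    fi
}

# Possible usage:
#   lvm install latest --backend "$(detect_backend)"
```

Finding a tool on PATH does not prove the driver works, so the result is only a starting point for `--backend`.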
# Switch to a specific installed version
lvm use b3412-cuda
# Interactive picker (default when no version given)
lvm use # picks from installed list
lvm use -i # same, explicit flag
lvm use b3412-cuda # non-interactive
# Switch to the stable channel (uses last stable version)
lvm channel stable
# Switch to the beta channel (uses last beta version)
lvm channel beta

# List all locally installed versions
lvm ls
# List available releases on GitHub
lvm ls-remote
# Show current active version
lvm current

# Manually refresh cached GitHub release data
lvm fetch

By default, releases are cached for 6 hours. Use lvm fetch to force a refresh before the cache expires.
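
To illustrate the 6-hour TTL, here is a rough sketch of a staleness check based on the cache file's modification time. It assumes the cache lives at `cache/releases_cache.json` under the lvm home directory. lvm itself may track expiry differently (for example with a timestamp inside the JSON), so `cache_is_stale` is purely illustrative.

```shell
#!/bin/sh
# Hypothetical check, not lvm's actual logic: treat the cache as stale when
# the file is missing or was last modified more than 6 hours (360 min) ago.
cache_is_stale() {
    cache_file=$1
    [ -f "$cache_file" ] || return 0
    # find prints the path only if its mtime is older than 360 minutes
    [ -n "$(find "$cache_file" -mmin +360 2>/dev/null)" ]
}

# Possible usage:
#   cache="${LVM_HOME:-$HOME/.lvm}/cache/releases_cache.json"
#   cache_is_stale "$cache" && lvm fetch
```

The `-mmin` test is supported by GNU and BSD find but is not part of POSIX.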
# Check for updates to the active version
lvm update

# Remove a specific version
lvm uninstall b3412-cuda
# Interactive picker (default when no version given)
lvm uninstall # picks from installed list
lvm uninstall -i # same, explicit flag
lvm uninstall b3412-cuda # non-interactive

# Show lvm version
lvm version
# Show currently active version details
lvm current

# Clone the repo and build
git clone https://github.com/asertym/lvm.git
cd lvm
go build -o lvm .
sudo mv lvm /usr/local/bin/lvm
# Initialize and install
lvm init
lvm install latest
# Verify
lvm current
llama-cli --version

# Try CUDA (if available)
lvm install latest --backend cuda
lvm use latest-cuda
# Fall back to Vulkan if CUDA fails
lvm uninstall latest-cuda
lvm install latest --backend vulkan
lvm use latest-vulkan
# CPU fallback
lvm uninstall latest-vulkan
lvm install latest --backend cpu
lvm use latest-cpu

# Install and use stable (default)
lvm install latest
lvm use latest-cpu
# Later, try beta features
lvm install latest-beta
lvm channel beta
# Back to stable when ready
lvm channel stable

# Project A uses older stable version
lvm use b3200-cuda
# Project B needs latest features
lvm use b3412-cuda
# Project C needs specific build
lvm install b3150
lvm use b3150-cpu

# Pick an installed version with arrow keys
lvm use
# Browse all releases and install one
lvm install
# Remove a version (active version is protected from removal)
lvm uninstall

~/.lvm/
├── active # Currently active version ID (e.g., "b3412-cuda")
├── channels.json # Channel state (stable/beta → version IDs)
├── cache/ # Cached GitHub release data (6-hour TTL)
│ └── releases_cache.json
├── shims/ # Auto-generated shell scripts
│ ├── llama-cli
│ ├── llama-server
│ ├── llama-bench
│ ├── llama-quantize
│ └── ...
└── versions/ # Installed llama.cpp versions
    ├── b3412-cuda/
    │   ├── llama-cli
    │   ├── llama-server
    │   ├── ...
    │   └── manifest.json
    └── b3200-cpu/
        ├── main
        ├── ...
        └── manifest.json
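
Given this layout, installed versions can be enumerated straight from the filesystem. The sketch below assumes one directory per version under `versions/` and the active ID stored as plain text in `active`. `list_installed` is a hypothetical helper and only approximates what a command like `lvm ls` might print.

```shell
#!/bin/sh
# Hypothetical sketch of an "ls"-style listing based on the layout above.
# Assumes one directory per version under versions/ and the active ID stored
# as plain text in the "active" file; lvm's real implementation may differ.
list_installed() {
    lvm_home=${LVM_HOME:-$HOME/.lvm}
    active=""
    [ -f "$lvm_home/active" ] && active=$(cat "$lvm_home/active")
    for dir in "$lvm_home"/versions/*/; do
        [ -d "$dir" ] || continue   # glob matched nothing
        id=$(basename "$dir")
        if [ "$id" = "$active" ]; then
            echo "* $id"            # mark the active version
        else
            echo "  $id"
        fi
    done
}
```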
Versions are identified by unique IDs combining the build tag and backend:
b3412-cuda # Build 3412 with CUDA backend
b3200-cpu # Build 3200 with CPU backend
b3150-metal # Build 3150 with Metal backend
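
Scripts that need the two parts separately can split an ID with plain parameter expansion. This sketch assumes the build tag never contains a hyphen, so splitting at the first hyphen keeps backend names such as `sycl-fp16` intact. The helper names are illustrative only.

```shell
#!/bin/sh
# Illustrative helpers: split a version ID at the first hyphen so that
# hyphenated backend names such as "sycl-fp16" stay intact.
version_tag()     { echo "${1%%-*}"; }   # drop everything from the first "-"
version_backend() { echo "${1#*-}"; }    # drop everything up to the first "-"

version_tag b3412-cuda            # prints: b3412
version_backend b3412-cuda        # prints: cuda
version_backend b3500-sycl-fp16   # prints: sycl-fp16
```

An ID without a hyphen is returned unchanged by both helpers, so callers should validate the input if that matters.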
lvm creates shell script wrappers (shims) for each llama.cpp binary:
# On Unix-like systems
llama-cli → ~/.lvm/shims/llama-cli
→ checks ~/.lvm/active
→ executes ~/.lvm/versions/<active>/llama-cli
# On Windows
llama-cli.cmd → %LVM_HOME%\shims\llama-cli.cmd
→ checks %LVM_HOME%\active
→ executes %LVM_HOME%\versions\<active>\llama-cli.exe

When called without a version argument (or with -i), install, use, and uninstall use a TUI picker built with charmbracelet/huh:
- Arrow keys to navigate
- Enter to confirm
- Ctrl+C to abort cleanly
- Installed versions and beta releases are visually marked
- Active version cannot be removed in uninstall
The picker also adapts to non-TTY contexts (piped input, CI) by switching to accessible keyboard-only mode.
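
Scripts and CI jobs may still prefer to fail fast instead of relying on the picker's fallback. A minimal sketch of such a guard follows. `is_interactive` and `use_version` are hypothetical wrapper functions, and the wrapper only prints the command it would run.

```shell
#!/bin/sh
# Illustrative guard for wrapper scripts: only rely on the picker when both
# stdin and stdout are terminals; otherwise require an explicit version ID.
is_interactive() {
    [ -t 0 ] && [ -t 1 ]
}

# Hypothetical wrapper around "lvm use"; prints the command instead of running it.
use_version() {
    if [ -n "${1:-}" ]; then
        echo "lvm use $1"
    elif is_interactive; then
        echo "lvm use"
    else
        echo "error: no TTY, pass a version ID explicitly" >&2
        return 1
    fi
}
```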
Two channels, stable and beta, each record a default version ID in channels.json:
{
"stable": "b3412-cuda",
"beta": "b3500-cpu"
}

| Variable | Description |
|---|---|
| LVM_HOME | Override default ~/.lvm location |
export LVM_HOME=/opt/lvm
lvm init

On Windows, lvm init automatically adds the shims directory to your user PATH via the Registry, so the change persists across terminal restarts.
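
On Unix-like systems, the LVM_HOME override and the shim lookup described earlier (read `active`, then run the matching binary under `versions/`) could combine roughly as follows. This is a sketch under those assumptions, not the script lvm actually generates, and `resolve_shim_target` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical shim logic, not the script lvm actually generates.
# Resolves the binary for the active version, honoring LVM_HOME.
resolve_shim_target() {
    name=$1
    lvm_home=${LVM_HOME:-$HOME/.lvm}   # fall back to the default location
    if [ ! -f "$lvm_home/active" ]; then
        echo "lvm: no active version; run 'lvm use <version-id>'" >&2
        return 1
    fi
    active=$(cat "$lvm_home/active")
    echo "$lvm_home/versions/$active/$name"
}

# A generated shim for llama-cli could then end with something like:
#   target=$(resolve_shim_target llama-cli) && exec "$target" "$@"
```

Using `exec` replaces the shim process with the real binary, so exit codes and signals pass through unchanged.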
# Solution: Install a version first
lvm install latest
lvm use latest-cpu

# Check active version
lvm current
# Re-initialize shims
rm ~/.lvm/shims/*
lvm init
# Or reinstall the version
lvm uninstall <version-id>
lvm install <version-id>

# Linux/macOS
source ~/.bashrc # or ~/.zshrc
# Windows (PowerShell)
Get-ItemProperty -Path 'HKCU:\Environment' -Name PATH | Format-List

# Force a specific backend
lvm install latest --backend cpu # fall back to CPU
# Check available backends for your platform
nvidia-smi # CUDA
vulkaninfo # Vulkan

- Version rollback/snapshots
- Backup and restore configurations
- Plugin system for custom backends
- GUI companion application
- Homebrew and Scoop packages
- Fork the repository
- Create a feature branch
- Submit a pull request
See CONTRIBUTING.md for details.
MIT License — see LICENSE for details.
- Based on llama.cpp by ggml-org
- Inspired by tools like nvm, n, rbenv, and asdf