lvm — llama.cpp Version Manager

A cross-platform CLI tool for managing multiple llama.cpp versions on your machine.

# Install latest stable version
lvm install latest

# Switch to a specific version
lvm use b3412-cuda

# Interactive version picker
lvm use          # arrow-key selection
lvm install      # arrow-key selection

# List all installed versions
lvm ls

What is lvm?

lvm is a lightweight version manager that simplifies working with multiple builds of llama.cpp. It handles:

Installation of llama.cpp releases from GitHub
Version switching between different builds (CPU, CUDA, Metal, Vulkan, etc.)
Channel management between stable and beta releases
Automatic shims for easy command invocation
Cross-platform support (Windows, Linux, macOS)

Think of it like nvm (Node Version Manager) but for llama.cpp.

Features

| Feature | Description |
| --- | --- |
| Multiple versions | Install and switch between any llama.cpp release |
| GPU backends | Support for CPU, CUDA, Metal, Vulkan, ROCm, OpenVINO, SYCL |
| Stable & Beta | Separate channels for production and bleeding-edge builds |
| Interactive picker | Arrow-key selection UI for `use`, `install`, and `uninstall` |
| One-command init | Automatic PATH configuration, zero manual setup |
| Auto-shims | All llama.cpp binaries become accessible via simple commands |
| Cross-platform | Works on Windows, Linux, and macOS |
| Cache | GitHub releases are cached for 6 hours to avoid repeated API calls |
| Refresh | Run `lvm fetch` to manually refresh cached data before TTL expires |
| Clean uninstall | Remove versions without leaving artifacts |

Installation

Official Installer (Recommended)

# Linux/macOS
curl -sSL https://github.com/asertym/lvm/releases/latest/download/install.sh | sh

# Windows (PowerShell)
# Download install.ps1 from https://github.com/asertym/lvm/releases and run it

Manual Installation

# Download the binary for your platform
wget https://github.com/asertym/lvm/releases/latest/download/lvm-linux-amd64

# Make it executable and move it to a location in your PATH
chmod +x lvm-linux-amd64
sudo mv lvm-linux-amd64 /usr/local/bin/lvm

# Initialize
lvm init

Building from Source

git clone https://github.com/asertym/lvm.git
cd lvm
go build -o lvm .
sudo mv lvm /usr/local/bin/lvm
lvm init

Quick Start

# 1. Initialize (run once)
lvm init

# 2. Install a version (interactive picker by default)
lvm install

# 3. Start using llama.cpp commands
llama-cli --help
llama-quantize model-f16.gguf model-q4_0.gguf Q4_0

Usage

Install a Version

# Latest stable release
lvm install latest

# Latest beta/pre-release
lvm install latest-beta

# Specific build number
lvm install b3412

# With explicit GPU backend
lvm install latest --backend cuda
lvm install b3412 --backend vulkan
lvm install latest --backend metal

# Interactive picker (default when no version given)
lvm install        # picks from releases list
lvm install -i     # same, explicit flag
lvm install latest # non-interactive

Available backends:

cpu — CPU-only build
cuda — NVIDIA CUDA GPU acceleration
metal — Apple Metal (macOS)
vulkan — Vulkan API
rocm — AMD ROCm
openvino — Intel OpenVINO
sycl-fp16 / sycl-fp32 — Intel SYCL

Switch Versions

# Switch to a specific installed version
lvm use b3412-cuda

# Interactive picker (default when no version given)
lvm use            # picks from installed list
lvm use -i         # same, explicit flag
lvm use b3412-cuda # non-interactive

# Switch to the stable channel (uses last stable version)
lvm channel stable

# Switch to the beta channel (uses last beta version)
lvm channel beta

List Versions

# List all locally installed versions
lvm ls

# List available releases on GitHub
lvm ls-remote

# Show current active version
lvm current

Fetch / Refresh Cache

# Manually refresh cached GitHub release data
lvm fetch

By default, releases are cached for 6 hours. Use lvm fetch to force a refresh before the cache expires.

Update

# Check for updates to the active version
lvm update

Uninstall

# Remove a specific version
lvm uninstall b3412-cuda

# Interactive picker (default when no version given)
lvm uninstall      # picks from installed list
lvm uninstall -i   # same, explicit flag
lvm uninstall b3412-cuda # non-interactive

Version Information

# Show lvm version
lvm version

# Show currently active version details
lvm current

Examples

Example 1: Setting Up a New Machine

# Clone the repo and build
git clone https://github.com/asertym/lvm.git
cd lvm
go build -o lvm .
sudo mv lvm /usr/local/bin/lvm

# Initialize and install
lvm init
lvm install latest

# Verify
lvm current
llama-cli --version

Example 2: Trying Different GPU Backends

# Try CUDA (if available)
lvm install latest --backend cuda
lvm use latest-cuda

# Fall back to Vulkan if CUDA fails
lvm uninstall latest-cuda
lvm install latest --backend vulkan
lvm use latest-vulkan

# CPU fallback
lvm uninstall latest-vulkan
lvm install latest --backend cpu
lvm use latest-cpu

Example 3: Using Stable and Beta Channels

# Install and use stable (default)
lvm install latest
lvm use latest-cpu

# Later, try beta features
lvm install latest-beta
lvm channel beta

# Back to stable when ready
lvm channel stable

Example 4: Managing Multiple Projects

# Project A uses older stable version
lvm use b3200-cuda

# Project B needs latest features
lvm use b3412-cuda

# Project C needs specific build
lvm install b3150
lvm use b3150-cpu

Example 5: Interactive Mode

# Pick an installed version with arrow keys
lvm use

# Browse all releases and install one
lvm install

# Remove a version (active version is protected from removal)
lvm uninstall

Directory Structure

~/.lvm/
├── active              # Currently active version ID (e.g., "b3412-cuda")
├── channels.json       # Channel state (stable/beta → version IDs)
├── cache/              # Cached GitHub release data (6-hour TTL)
│   └── releases_cache.json
├── shims/              # Auto-generated shell scripts
│   ├── llama-cli
│   ├── llama-server
│   ├── llama-bench
│   ├── llama-quantize
│   └── ...
└── versions/           # Installed llama.cpp versions
    ├── b3412-cuda/
    │   ├── llama-cli
    │   ├── llama-server
    │   ├── ...
    │   └── manifest.json
    └── b3200-cpu/
        ├── main
        ├── ...
        └── manifest.json

How It Works

Version IDs

Versions are identified by unique IDs combining the build tag and backend:

b3412-cuda   # Build 3412 with CUDA backend
b3200-cpu    # Build 3200 with CPU backend
b3150-metal  # Build 3150 with Metal backend
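To illustrate the ID format, here is a small Go sketch that splits an ID back into its build tag and backend. The function name `splitVersionID` is hypothetical and lvm's real parsing may differ; the one design point worth noting is that cutting at the first hyphen keeps multi-part backends such as `sycl-fp16` intact.

```go
package main

import (
	"fmt"
	"strings"
)

// splitVersionID separates an ID such as "b3412-cuda" into tag and backend.
// It cuts at the first hyphen so backends like "sycl-fp16" stay whole.
func splitVersionID(id string) (tag, backend string, ok bool) {
	tag, backend, ok = strings.Cut(id, "-")
	if !ok || tag == "" || backend == "" {
		return "", "", false
	}
	return tag, backend, true
}

func main() {
	for _, id := range []string{"b3412-cuda", "b3500-sycl-fp16", "b3150"} {
		tag, backend, ok := splitVersionID(id)
		fmt.Printf("%q -> tag=%q backend=%q ok=%v\n", id, tag, backend, ok)
	}
}
```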

Shims

lvm creates shell script wrappers (shims) for each llama.cpp binary:

# On Unix-like systems
llama-cli → ~/.lvm/shims/llama-cli
          → checks ~/.lvm/active
          → executes ~/.lvm/versions/<active>/llama-cli

# On Windows
llama-cli.cmd → %LVM_HOME%\shims\llama-cli.cmd
              → checks %LVM_HOME%\active
              → executes %LVM_HOME%\versions\<active>\llama-cli.exe

Interactive Picker

When called without a version argument (or with -i), install, use, and uninstall use a TUI picker built with charmbracelet/huh:

Arrow keys to navigate
Enter to confirm
Ctrl+C to abort cleanly
Installed versions and beta releases are visually marked
Active version cannot be removed in uninstall

The picker also adapts to non-TTY contexts (piped input, CI) by switching to huh's accessible mode, which replaces the arrow-key UI with plain text prompts.

Channel State

Each channel (stable and beta) records the version ID it last pointed to. channels.json stores this mapping:

{
	"stable": "b3412-cuda",
	"beta": "b3500-cpu"
}
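Reading this file back is a straightforward JSON decode. The Go sketch below shows one way it could look; the type `channelState` and the helpers `parseChannels` and `versionFor` are hypothetical names, and only the two keys shown above are assumed to exist.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// channelState mirrors the two keys shown in the JSON above.
type channelState struct {
	Stable string `json:"stable"`
	Beta   string `json:"beta"`
}

// versionFor returns the version ID recorded for a channel name.
func (c channelState) versionFor(channel string) (string, error) {
	switch channel {
	case "stable":
		return c.Stable, nil
	case "beta":
		return c.Beta, nil
	}
	return "", fmt.Errorf("unknown channel %q", channel)
}

// parseChannels decodes the contents of channels.json.
func parseChannels(data []byte) (channelState, error) {
	var c channelState
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	raw := []byte(`{"stable": "b3412-cuda", "beta": "b3500-cpu"}`)
	state, err := parseChannels(raw)
	if err != nil {
		panic(err)
	}
	id, _ := state.versionFor("beta")
	fmt.Println("lvm channel beta would activate:", id)
}
```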

Configuration

Environment Variables

| Variable | Description |
| --- | --- |
| `LVM_HOME` | Override default `~/.lvm` location |

Custom Install Location

export LVM_HOME=/opt/lvm
lvm init

Windows PATH

On Windows, lvm init automatically adds the shims directory to your user PATH via the Registry, ensuring it survives terminal restarts.

Troubleshooting

"no active version set"

# Solution: Install a version first
lvm install latest
lvm use latest-cpu

"binary not found"

# Check active version
lvm current

# Re-initialize shims
rm "${LVM_HOME:-$HOME/.lvm}"/shims/*
lvm init

# Or reinstall the version
lvm uninstall <version-id>
lvm install <version-id>

PATH not working

# Linux/macOS
source ~/.bashrc    # or ~/.zshrc

# Windows (PowerShell)
Get-ItemProperty -Path 'HKCU:\Environment' -Name PATH | Format-List

GPU backend not detected

# Force a specific backend
lvm install latest --backend cpu   # fall back to CPU

# Check available backends for your platform
nvidia-smi       # CUDA
vulkaninfo       # Vulkan

Roadmap

Version rollback/snapshots
Backup and restore configurations
Plugin system for custom backends
GUI companion application
Homebrew and Scoop packages

Contributing

Fork the repository
Create a feature branch
Submit a pull request

See CONTRIBUTING.md for details.

License

MIT License — see LICENSE for details.

Credits

Based on llama.cpp by ggml-org
Inspired by tools like nvm, n, rbenv, asdf
