release: v2.1.0 — native snapshot-fork, warm-pool CoW fill, prune#28
Open
ZhiXiao-Lin wants to merge 22 commits into
added 22 commits
June 12, 2026 15:17
Code survey of A3S-Lab/libkrun (vendored checkout): vCPU/VM state save and restore already exist as dead code; pause/resume is half-wired; TEE guest_memfd is an in-tree precedent for non-anonymous RAM; device serialization is missing. Our use case snapshots an IDLE deferred-main template (quiesced queues), which shrinks device state to queue registration plus the virtio-fs inode map (the one hard item). Native path chosen over a Firecracker/CH second backend (no virtio-fs in FC; no adapter layer; single VMM). 4-phase plan; Phase A = RAM+CPU snapshot/restore with MAP_PRIVATE CoW, go/no-go in ~2 weeks.
The shim is spawned by the controller and doesn't inherit a3s-box's env, so KRUN_SNAPSHOT_MEM_FILE / KRUN_SNAPSHOT_SOCK never reached libkrun. Forward them verbatim (like A3S_BOX_KSM). Experimental Phase-A plumbing; per-box paths come in Phase C.
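The forwarding described above can be sketched with `std::process::Command`. This is a minimal illustration, not the controller's actual code: the `forward_env` helper, the `FORWARDED_VARS` list, and the injected `lookup` closure are assumptions made so the example is self-contained and testable.

```rust
use std::ffi::OsString;
use std::process::Command;

/// Variables the shim needs but does not inherit on its own.
/// (Illustrative list based on the names in the commit message.)
const FORWARDED_VARS: [&str; 3] = [
    "KRUN_SNAPSHOT_MEM_FILE",
    "KRUN_SNAPSHOT_SOCK",
    "A3S_BOX_KSM",
];

/// Copy every forwarded variable that `lookup` resolves onto the child
/// command, unchanged. Returns how many variables were forwarded.
/// `lookup` is injected (hypothetical design) so the function can be
/// exercised without mutating the real process environment.
fn forward_env<F>(cmd: &mut Command, lookup: F) -> usize
where
    F: Fn(&str) -> Option<OsString>,
{
    let mut forwarded = 0;
    for &key in FORWARDED_VARS.iter() {
        if let Some(value) = lookup(key) {
            cmd.env(key, value);
            forwarded += 1;
        }
    }
    forwarded
}
```

In a real caller the lookup would typically be `|k| std::env::var_os(k)`; unset variables are simply skipped rather than forwarded as empty strings.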
…boots
A restored guest resumes already booted, so its exec server never re-signals readiness; the cold-boot wait_for_exec_ready loop would stall box registration until its 120s safety cap, leaving the VM alive but unregistered (a3s-box exec → 'No such box'). On restore (KRUN_RESTORE_FROM set): wait_for_vm_running plus a single best-effort exec probe (probe_exec_ready_once, non-blocking), so the box registers immediately and exec/attach connect on demand. Also gate the deferred-main auto-spawn off in restore mode: the restored guest's main is already running, and re-spawning would duplicate it.
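A rough sketch of the two readiness strategies, assuming a boolean probe stands in for the real exec-server check. `BootMode`, `Readiness`, `await_exec_ready`, and `should_auto_spawn_main` are hypothetical names, and the cold-boot loop is bounded by a poll count here rather than by the 120s cap.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum BootMode {
    Cold,
    Restore,
}

#[derive(Debug, PartialEq)]
enum Readiness {
    /// The exec server answered.
    Ready,
    /// Restore mode only: the single probe failed, register anyway and
    /// let exec/attach connect on demand.
    Deferred,
    /// Cold boot only: the polling budget ran out.
    TimedOut,
}

/// Cold boot polls until the probe succeeds or the budget is spent.
/// Restore probes exactly once and never blocks registration.
fn await_exec_ready<P>(mode: BootMode, mut probe: P, max_polls: u32) -> Readiness
where
    P: FnMut() -> bool,
{
    match mode {
        BootMode::Restore => {
            if probe() {
                Readiness::Ready
            } else {
                Readiness::Deferred
            }
        }
        BootMode::Cold => {
            for _ in 0..max_polls {
                if probe() {
                    return Readiness::Ready;
                }
            }
            Readiness::TimedOut
        }
    }
}

/// A restored guest already has its main running, so only a cold boot
/// may auto-spawn the deferred main.
fn should_auto_spawn_main(mode: BootMode) -> bool {
    mode == BootMode::Cold
}
```

The point of the `Deferred` outcome is that a failed probe on restore is not an error: registration proceeds either way.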
… 40ms)
wait_for_vm_running always looped to its fixed 250ms cap; it is a crash-detection grace period, not a readiness wait (the VM is alive the instant the shim spawns). A snapshot-restored VM reaches its run loop in ~20ms, so 40ms catches an immediate restore failure while saving ~210ms/fork on the fast path. Cold boot keeps 250ms.
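The grace-window idea can be sketched as follows. The two durations come from the commit message; the function signature, the `has_exited` closure, and the 5ms poll interval are assumptions for illustration only.

```rust
use std::time::{Duration, Instant};

const COLD_BOOT_GRACE: Duration = Duration::from_millis(250);
const RESTORE_GRACE: Duration = Duration::from_millis(40);

/// Pick the crash-detection window for this kind of start.
fn crash_grace(is_restore: bool) -> Duration {
    if is_restore {
        RESTORE_GRACE
    } else {
        COLD_BOOT_GRACE
    }
}

/// Watch the VM process for the whole grace window. An early exit is
/// reported as an error; surviving the window counts as "running".
/// `has_exited` is a stand-in for a real liveness check such as
/// `Child::try_wait`.
fn wait_for_vm_running<F>(is_restore: bool, mut has_exited: F) -> Result<Duration, &'static str>
where
    F: FnMut() -> bool,
{
    let grace = crash_grace(is_restore);
    let start = Instant::now();
    while start.elapsed() < grace {
        if has_exited() {
            return Err("VM process exited during the grace window");
        }
        std::thread::sleep(Duration::from_millis(5));
    }
    Ok(grace)
}
```

The saving per fork is just the difference between the two windows: 250ms - 40ms = 210ms.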
prepare_layout did a registry pull/resolution (~100ms network round-trip, even on a cache hit) on every restore. A snapshot-fork reuses the already-cached rootfs, so on restore compute the cache key + use the cached path directly, skipping the pull and the guest-init refresh. Verified on KVM: registered ~230ms->~109ms, exec still works. Cumulative with the readiness fix: ~450ms->~109ms registered (~4x).
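One way to picture the restore shortcut is a cache lookup that bypasses the pull entirely. Everything here is a hypothetical stand-in: the `HashMap` plays the rootfs cache, the cache key is passed in already computed, and falling back to a pull on a cache miss is an assumption the commit message does not confirm.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Resolve the rootfs path for a box.
/// On restore, a cache hit returns the cached path without invoking
/// `pull` at all; cold boots always go through `pull`.
fn resolve_rootfs<F>(
    is_restore: bool,
    cache: &HashMap<String, PathBuf>,
    cache_key: &str,
    pull: F,
) -> PathBuf
where
    F: FnOnce() -> PathBuf,
{
    if is_restore {
        if let Some(path) = cache.get(cache_key) {
            return path.clone();
        }
    }
    // Assumed fallback: behave like a cold boot if nothing is cached.
    pull()
}
```

For scale, the figures quoted above work out to roughly 230 / 109 ≈ 2.1× from this change alone and 450 / 109 ≈ 4.1× cumulatively.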
…t lost-update + O(N²) bottleneck)
run registered the box via load_default() (outside the lock) followed by state.add() (which saves the stale in-memory snapshot under the lock): a lost-update race in which concurrent fork registrations clobbered each other (a burst of N left only a fraction registered). Switch to the atomic StateFile::add_record (load fresh under lock → push → write), and make that append load WITHOUT the reconcile sweep (a PID-liveness + cleanup pass over every other box). Appending one box must not be O(N) under the global lock, which serialized a high-concurrency fork burst into O(N²) syscalls. Reconcile still runs on list/status loads and in the monitor.
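The difference between the two registration patterns can be shown with an in-memory stand-in. This is only a model: the `Mutex<Vec<String>>` plays the role of the locked on-disk state file, and the `load`/`save` pair mimics the old load-outside-lock then save sequence. Only the name `add_record` comes from the commit; the rest is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// In-memory model of the state file; the Mutex stands in for the file lock.
#[derive(Clone, Default)]
struct StateFile {
    disk: Arc<Mutex<Vec<String>>>,
}

impl StateFile {
    /// Old pattern, step 1: take a snapshot and release the lock.
    fn load(&self) -> Vec<String> {
        self.disk.lock().unwrap().clone()
    }

    /// Old pattern, step 2: overwrite the file with a possibly stale snapshot.
    fn save(&self, snapshot: Vec<String>) {
        *self.disk.lock().unwrap() = snapshot;
    }

    /// New pattern: read the current contents, push, and write back
    /// within a single lock acquisition, with no sweep over other records.
    fn add_record(&self, id: String) {
        let mut records = self.disk.lock().unwrap();
        records.push(id);
    }

    fn len(&self) -> usize {
        self.disk.lock().unwrap().len()
    }
}

/// Register `n` boxes from `n` threads using the atomic append.
fn register_burst(state: &StateFile, n: usize) {
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let s = state.clone();
            thread::spawn(move || s.add_record(format!("box-{}", i)))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```

With `load` + `save`, two writers that both snapshot an empty file each write back a one-element list, so one registration is lost. With `add_record`, a burst of N always leaves N records.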
…xConfig)
Restore was detected only from the process-global KRUN_RESTORE_FROM env, which a single process driving MANY VMs (the warm pool / a future fork daemon) cannot express. Add snapshot_mem_file/snapshot_sock/restore_from to BoxConfig + InstanceSpec; build_instance_spec sources them from config (env fallback for single-VM run); controller.rs sets the shim env per-VM from the spec (precedence over global env); is_restore_mode(config) checks the per-VM config OR env. Single-VM run behavior unchanged (env still works). Foundation for pool restore-fill + fork daemon.
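The precedence rule (per-VM config first, process env as fallback) reduces to a small function. The three field names match the commit message; the struct layout, `effective_restore_from`, and the explicit `env_value` parameter are assumptions made for this sketch. The real `is_restore_mode` is described as taking only the config.

```rust
/// Illustrative subset of the per-VM configuration.
#[derive(Clone, Default)]
#[allow(dead_code)]
struct BoxConfig {
    snapshot_mem_file: Option<String>,
    snapshot_sock: Option<String>,
    restore_from: Option<String>,
}

/// Per-VM config wins; the global env value is used only when the
/// config leaves the field unset.
fn effective_restore_from(config: &BoxConfig, env_value: Option<String>) -> Option<String> {
    config.restore_from.clone().or(env_value)
}

/// A VM is in restore mode if either source names a snapshot.
fn is_restore_mode(config: &BoxConfig, env_value: Option<String>) -> bool {
    effective_restore_from(config, env_value).is_some()
}
```

A caller would pass something like `std::env::var("KRUN_RESTORE_FROM").ok()` as `env_value`, which keeps the single-VM, env-driven flow working while letting a multi-VM process set a different value per box.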
…e later rollback paths
The earlier registration-race fix removed the local `state` but left two rollback calls (volume-attach, log-dir) still referencing it, so the CLI didn't compile (tests had been running a stale binary). The record is registered atomically via add_record, so these paths now un-register via StateFile::remove_record and roll back with state=None.
… (opt-in)
WarmPool.boot_new_vm and the background replenish loop both cold-booted every VM (VmManager::new + boot), bypassing the snapshot infra, so the pool that exists to make VMs fast paid a full ~1.7s cold boot per slot. Add PoolConfig.snapshot_fork (opt-in): boot ONE template VM with file-backed RAM, trigger its snapshot once, then restore every other slot (MAP_PRIVATE CoW, ~tens of ms) via the per-VM restore config seam. All same-image pool VMs share one RAM image. Default off (no behavior change).
…oinSet
The maintenance loop booted the `needed` slots one await at a time. Spawn them concurrently: each boot/restore overlaps its readiness wait, so a batch fills in about one boot's time instead of N×. For snapshot-fork, ensure_template's lock serializes the one-time template build while the rest wait, then restore in parallel.
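The fill pattern from the two pool commits above (build one template under a lock, then restore the remaining slots in parallel) can be modelled with standard-library threads. Note the substitution: the commits use a tokio JoinSet, while this sketch uses `std::thread` so it runs without extra crates. `Pool`, `restore_slot`, `fill`, and the `template.mem` path are hypothetical; only `ensure_template` is named in the source.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

struct Pool {
    /// Path of the shared RAM image once the template has been built.
    template: Mutex<Option<String>>,
    /// Counts template builds so the "exactly once" property is observable.
    template_builds: AtomicUsize,
}

impl Pool {
    fn new() -> Self {
        Pool {
            template: Mutex::new(None),
            template_builds: AtomicUsize::new(0),
        }
    }

    /// The lock serializes the one-time build; later callers wait on the
    /// lock and then reuse the existing image.
    fn ensure_template(&self) -> String {
        let mut slot = self.template.lock().unwrap();
        if let Some(path) = slot.as_ref() {
            return path.clone();
        }
        self.template_builds.fetch_add(1, Ordering::SeqCst);
        let path = "template.mem".to_string(); // placeholder for boot + snapshot
        *slot = Some(path.clone());
        path
    }

    /// Stand-in for a CoW restore of one pool slot from the shared image.
    fn restore_slot(&self, slot: usize) -> String {
        let image = self.ensure_template();
        format!("vm-{} restored from {}", slot, image)
    }
}

/// Fill `needed` slots concurrently instead of one after another.
fn fill(pool: &Arc<Pool>, needed: usize) -> Vec<String> {
    let handles: Vec<_> = (0..needed)
        .map(|i| {
            let p = Arc::clone(pool);
            thread::spawn(move || p.restore_slot(i))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

However many slots race into `ensure_template`, the build counter ends at 1, and every slot reports the same RAM image.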
…tainer prune)
The a3s-box-test.md report's only still-unfixed bug: there is no box-only prune (only system-prune, which also nukes images, and image-prune). Add 'a3s-box prune [--force]' (visible alias 'container-prune') that removes created/stopped/dead boxes, keeping running/paused. Also add the previously missing 'import' and the new 'prune' to the command-coverage test list.
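The selection rule for prune is a simple partition on box status. The status names follow the commit message; the `BoxStatus` enum, the `(id, status)` tuple representation, and the `prune` signature are illustrative assumptions rather than the CLI's real types.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum BoxStatus {
    Created,
    Running,
    Paused,
    Stopped,
    Dead,
}

/// Created, stopped, and dead boxes are removable; running and paused
/// boxes are always kept.
fn is_prunable(status: BoxStatus) -> bool {
    matches!(
        status,
        BoxStatus::Created | BoxStatus::Stopped | BoxStatus::Dead
    )
}

/// Split the box list into (kept, removed), preserving order.
fn prune(boxes: Vec<(String, BoxStatus)>) -> (Vec<(String, BoxStatus)>, Vec<(String, BoxStatus)>) {
    boxes
        .into_iter()
        .partition(|(_, status)| !is_prunable(*status))
}
```

Images are not touched at all here, which is the distinction from system-prune drawn in the commit.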
… running
Erroring out when no daemon is up is wrong for a status query (and broke the local-state smoke test); report 'No pool daemon running' and exit 0, like ps with no boxes.
…lement 0
The two command_coverage smoke tests indexed the inspect JSON directly (e.g.
inspect["Reference"]) but inspect/image-inspect return [{...}] (Docker-compatible), so
the lookups were Null. Take element 0 first. Product behavior is correct; the tests
were stale.
Carries the snapshot-restore fixes that make CoW fork correct: virtio-fs inode-map persist/restore (fixes exec EBADF on a restored guest), virtio queue-index reconciliation against guest-RAM used_idx, and live vsock muxer queue capture. Required for the v2.1.0 snapshot-fork path.
- Native snapshot-fork (Copy-on-Write microVM cloning): snapshot a booted template (file-backed guest RAM + KVM/virtio state), restore many forks via MAP_PRIVATE. ~4x faster per fork than cold boot; 100 forks < ~1s on /dev/kvm.
- Warm pool 'pool start --snapshot-fork' fill + parallel JoinSet replenish.
- 'prune' command (Docker container prune): remove all created/stopped/dead boxes.
- Per-VM snapshot/restore config seam (BoxConfig/InstanceSpec).
- Fixes: atomic concurrent box registration, 'pool status' graceful with no daemon, faster OCI-free restore readiness.

Bumps workspace 2.0.7 -> 2.1.0 and refreshes README + CHANGELOG.
Arch-gate VcpuEvent::SaveState to x86_64 so the vmm compiles on the linux-arm64 (aarch64) release target. Fixes the v2.1.0 release build failure (E0425: cannot find type VcpuState on aarch64).
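Arch-gating an enum variant looks roughly like this. The sketch is not the libkrun source: the `Pause` and `Resume` variants, the placeholder `VcpuState` body, and the `describe` function are assumptions. The key detail is that the variant, the type it carries, and every match arm that names it must all sit behind the same `cfg`.

```rust
/// Placeholder for the x86_64-only register state.
#[cfg(target_arch = "x86_64")]
#[derive(Debug, Default)]
#[allow(dead_code)]
struct VcpuState {
    regs: [u64; 4],
}

#[derive(Debug)]
#[allow(dead_code)]
enum VcpuEvent {
    Pause,
    Resume,
    /// Only exists where `VcpuState` exists, so aarch64 builds never
    /// reference the missing type.
    #[cfg(target_arch = "x86_64")]
    SaveState(VcpuState),
}

fn describe(event: &VcpuEvent) -> &'static str {
    match event {
        VcpuEvent::Pause => "pause",
        VcpuEvent::Resume => "resume",
        // The arm is gated too; otherwise the match would not compile
        // on targets where the variant is absent.
        #[cfg(target_arch = "x86_64")]
        VcpuEvent::SaveState(_) => "save-state",
    }
}
```

On aarch64 the enum simply has two variants and the match stays exhaustive.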
warm_pool::trigger_snapshot connects to libkrun's snapshot trigger socket via tokio::net::UnixStream, which does not exist on Windows (E0433). Snapshot-fork is a Linux/KVM feature, so gate the real impl to #[cfg(unix)] and add a #[cfg(not(unix))] stub that errors — mirroring the existing cfg(not(windows)) pattern in the pool CLI. Fixes the v2.1.0 Build Windows (WHPX) failure.
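The real-impl-plus-stub pattern can be sketched with the standard library's blocking `UnixStream` in place of tokio's, so the example needs no external crates. What is actually sent over the trigger socket is not described in the commit, so this version only connects; that, along with the error text, is an assumption.

```rust
use std::io;
use std::path::Path;

/// Unix: connect to the snapshot trigger socket.
#[cfg(unix)]
fn trigger_snapshot(sock: &Path) -> io::Result<()> {
    use std::os::unix::net::UnixStream;
    // Assumption: a successful connection is enough for this sketch;
    // any request/response protocol is left out.
    let _stream = UnixStream::connect(sock)?;
    Ok(())
}

/// Everything else: same signature, always an error, so callers
/// compile unchanged on platforms without Unix sockets.
#[cfg(not(unix))]
fn trigger_snapshot(_sock: &Path) -> io::Result<()> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        "snapshot-fork requires a Unix host with KVM",
    ))
}
```

Because both variants share one signature, call sites need no `cfg` of their own.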
v2.1.0

Native Copy-on-Write snapshot-fork for a3s-box plus supporting features and fixes. Bumps the workspace 2.0.7 → 2.1.0.

Added
- Native snapshot-fork: snapshot a booted template and restore many forks via `MAP_PRIVATE`. Verified on `/dev/kvm`: ~4× faster per fork than a cold boot (~450 ms → ~110 ms), 100 forks in under ~1 s (~8 ms amortized each, ~13 MB RSS), `exec` runs real commands over virtio-fs in the restored guest. Driven by `KRUN_SNAPSHOT_MEM_FILE` / `KRUN_SNAPSHOT_SOCK` / `KRUN_RESTORE_FROM` or per-VM `BoxConfig` / `InstanceSpec`.
- Warm pool `pool start --snapshot-fork` fill (one template, CoW-restore the rest) + parallel JoinSet replenish: fill-to-8 ~12.4 s → ~1.9 s.
- `prune` command (alias `container-prune`): remove all created/stopped/dead boxes (Docker container prune).
- Per-VM snapshot/restore config seam (`snapshot_mem_file` / `snapshot_sock` / `restore_from`).

Fixed
- `pool status` exits successfully when no daemon is running.

Submodule
- `8bb409b` (snapshot-restore: virtio-fs inode-map persist/restore, queue-index reconciliation, live vsock muxer queue capture) on `A3S-Lab/libkrun@feat/snapshot-restore`.

Docs
- `apps/docs` box docs (en+cn) updated in the monorepo.

Tests green on the branch: core 403, runtime 907, cli lib 566, command_coverage 6. Snapshot-fork path KVM-verified.

Tagging `v2.1.0` after merge triggers release.yml (crates.io + Homebrew + winget + GitHub Release).