Description
On a Docker-driver GPU host (NVIDIA GPU auto-detected), `nemoclaw onboard` cannot bring up a GPU-enabled sandbox. While creating the sandbox, onboard enables GPU passthrough. This is the standard create-then-GPU-enable path, which runs on a normal first onboard whenever a GPU is present on a Docker-driver gateway (gated by NEMOCLAW_DOCKER_GPU_PATCH). The OpenShell supervisor never reconnects to the GPU-enabled container, so the sandbox enters the Error phase before the GPU proof can run, the step aborts with exit 1, and onboard fails.
Reproduced on Ubuntu 24.04 (RTX PRO 6000 Blackwell) and Ubuntu 26.04 (RTX A6000) with NemoClaw v0.0.60.
Environment
Device: GPU CI runners — Ubuntu 24.04 (NVIDIA RTX PRO 6000 Blackwell Server Edition, 97887 MB) and Ubuntu 26.04 (NVIDIA RTX A6000, 46068 MB)
OS: Ubuntu 24.04 / Ubuntu 26.04
Architecture: x86_64
Node.js: v22.22.2
npm: 10.9.7
Docker: docker (Docker-driver gateway; Docker CDI GPU support detected, /etc/cdi/nvidia.yaml)
OpenShell CLI: 0.0.44
NemoClaw: v0.0.60
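To compare another host against the environment above, a small preflight check can help. The following is a minimal sketch, not part of NemoClaw: the function name `gpu_preflight` is hypothetical, and it only checks that the CDI spec file named above exists and that `docker` and `nvidia-smi` are on `PATH`. It does not verify that GPU passthrough actually works.

```python
import shutil
from pathlib import Path


def gpu_preflight(cdi_spec="/etc/cdi/nvidia.yaml"):
    """Collect basic host facts relevant to this report.

    Hypothetical helper: it only tests for the presence of files and
    executables, not for working GPU passthrough.
    """
    return {
        # CDI spec path taken from the Environment section of this report.
        "cdi_spec_present": Path(cdi_spec).is_file(),
        "docker_on_path": shutil.which("docker") is not None,
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in gpu_preflight().items():
        print(f"{key}={value}")
```

A host where `cdi_spec_present` is false differs from the failing runners described here, so a failure on such a host may have a different cause.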
Steps to Reproduce
1. Start on a Docker-driver Linux GPU host (Ubuntu 24.04 or 26.04) that has an NVIDIA GPU, Docker CDI GPU support, and no existing NemoClaw sandbox.
2. Run a normal first onboard: `nemoclaw onboard` (GPU auto-detected; OpenShell GPU passthrough enabled by default).
3. Onboard creates the sandbox and then enables GPU access on the Docker container (the GPU-enable step).
4. Observe the GPU-enable step result and the sandbox phase.
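The steps above can be scripted so that the failing default path and the escape-hatch path are run the same way. This is a minimal sketch under stated assumptions: `onboard_invocation` and `run_onboard` are hypothetical helpers, the only command used is `nemoclaw onboard` as given in step 2, and `NEMOCLAW_DOCKER_GPU_PATCH=0` is the escape hatch printed in the product log under Actual Result. Treating an unset variable as the default (patch enabled) is an assumption based on the description.

```python
import os
import subprocess


def onboard_invocation(skip_gpu_patch=False, base_env=None):
    """Build the argv and environment for one onboard run.

    Hypothetical helper. Assumes an unset NEMOCLAW_DOCKER_GPU_PATCH means
    the default behavior (GPU patch enabled).
    """
    env = dict(os.environ if base_env is None else base_env)
    if skip_gpu_patch:
        # Escape hatch quoted from the product log.
        env["NEMOCLAW_DOCKER_GPU_PATCH"] = "0"
    else:
        # Remove any inherited override so the default path is exercised.
        env.pop("NEMOCLAW_DOCKER_GPU_PATCH", None)
    return ["nemoclaw", "onboard"], env


def run_onboard(skip_gpu_patch=False):
    """Run onboard and return its exit code (1 is the failure reported here)."""
    argv, env = onboard_invocation(skip_gpu_patch=skip_gpu_patch)
    return subprocess.run(argv, env=env).returncode
```

Calling `run_onboard()` exercises the path described in this report. Calling `run_onboard(skip_gpu_patch=True)` skips the GPU patch, which helps confirm that the failure is isolated to the GPU-enable step.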
Expected Result
The sandbox is created with GPU access, the OpenShell supervisor reconnects to the GPU-enabled container, the GPU proof runs, the sandbox reaches Ready, and the first onboard completes.
Actual Result
The GPU-enable step fails and onboard aborts (exit 1). Product log:
Docker-driver GPU patch active; creating sandbox first, then recreating the Docker container with GPU access.
...
patched_create_option=--gpus all
Docker GPU patch failed.
OpenShell supervisor did not reconnect to the GPU-enabled container; pre-patch sandbox restored.
OpenShell sandbox entered Error phase before the GPU proof could run.
sandbox_phase=Error
Diagnostics saved: /var/lib/gitlab-runner/.nemoclaw/onboard-failures/-my-assistant-docker-gpu-patch
Escape hatch: set NEMOCLAW_DOCKER_GPU_PATCH=0 to skip this patch.
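For triage across several CI runs, the `key=value` lines in this log can be extracted mechanically. The sketch below is illustrative only: `summarize` is a hypothetical helper, the excerpt is copied from the log above, and the assumption that structured fields always appear as a bare `key=value` line is inferred from this one sample.

```python
import re

# Excerpt copied from the product log in this report.
LOG_EXCERPT = """\
Docker-driver GPU patch active; creating sandbox first, then recreating the Docker container with GPU access.
patched_create_option=--gpus all
Docker GPU patch failed.
OpenShell supervisor did not reconnect to the GPU-enabled container; pre-patch sandbox restored.
OpenShell sandbox entered Error phase before the GPU proof could run.
sandbox_phase=Error
"""

# A line that consists of a single identifier, "=", and a value.
KEY_VALUE = re.compile(r"^(\w+)=(.*)$")


def summarize(log_text):
    """Pull key=value fields and the patch-failure marker out of a log."""
    fields = {}
    for line in log_text.splitlines():
        match = KEY_VALUE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return {
        "fields": fields,
        "patch_failed": "Docker GPU patch failed." in log_text,
    }


if __name__ == "__main__":
    print(summarize(LOG_EXCERPT))
```

Prose lines such as the escape-hatch hint are ignored because they do not start with a bare identifier followed by `=`.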
Bug Details
| Field | Value |
| --- | --- |
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Onboard, NemoClaw_Sandbox, NemoClaw-SWQA-RelBlckr-Recommended |
[NVB#6281494]