Add workshop user to netdev and render groups for other bases #833

Open: jonathan-conder wants to merge 1 commit into main from fix/render-gids

@jonathan-conder commented Jun 8, 2026

Description

Different bases can have different IDs for a named group. This affects devices in the render group: for example, GPU devices are given the clock group in a 26.04 container on a 24.04 host.
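
To make the mismatch concrete, here is a minimal sketch of how one numeric GID can resolve to different names depending on which base's /etc/group is consulted. The GID values and the group file excerpts are invented for illustration and are not taken from real Ubuntu images.

```python
def parse_group_file(text):
    """Map GID -> group name from /etc/group-style lines (name:passwd:gid:members)."""
    names = {}
    for line in text.strip().splitlines():
        name, _passwd, gid, _members = line.split(":", 3)
        names[int(gid)] = name
    return names


# Hypothetical excerpts; the real GIDs differ per release.
HOST_GROUPS = "video:x:44:\nrender:x:990:\n"
CONTAINER_GROUPS = "video:x:44:\nclock:x:990:\nrender:x:991:\n"

host = parse_group_file(HOST_GROUPS)
container = parse_group_file(CONTAINER_GROUPS)

device_gid = 990  # assumed GID that the host assigns to the GPU device node
print(host[device_gid], "->", container[device_gid])
```

With these made-up values, the same device GID reads as render on the host and clock in the container, which is why membership in the container's own render group is not enough.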

This PR works around the issue by adding the workshop user to all variants of the render group on supported hosts (Ubuntu 20.04 through 26.04). We can add more numbers as necessary for other distributions.

To keep cloud-config the same for all bases, we list the groups by ID and set create_groups: false. This means cloud-init passes the group IDs directly to useradd without trying to overwrite existing groups.

Since useradd requires the groups to exist, we need to create them ourselves. I don't think cloud-init provides a way to specify the ID of a new group, so we manually run groupadd as part of a bootcmd and ignore the "already exists" error codes.
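
As a rough illustration of the "ignore already exists" behaviour, the same idea can be written in Python. This is only a sketch: the PR does this in a shell function inside bootcmd, and the injectable `run` parameter is an assumption added here so the logic can be exercised without root.

```python
import subprocess

# Exit codes documented in groupadd(8): 4 = GID not unique, 9 = group name not unique.
IGNORED_EXIT_CODES = {4, 9}


def maybe_groupadd(gid, name, run=subprocess.run):
    """Create a system group with a fixed GID, tolerating "already exists" errors.

    `run` is a hypothetical injection point; it defaults to subprocess.run.
    """
    result = run(["groupadd", "-g", str(gid), "-r", name])
    code = result.returncode
    if code != 0 and code not in IGNORED_EXIT_CODES:
        raise RuntimeError(f"groupadd {name} failed with exit code {code}")
    return code
```

Any other non-zero exit code, such as an invalid argument, still surfaces as an error rather than being silently swallowed.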

Compatibility

The GID changes will be picked up when users refresh their workshops, since I bumped the snapshot format revision. This is the first time we've done this, so I verified that old workshops continue to function after refreshing the snap (namely launch --continue and restore).

At some point we can drop support for format revision 1, maybe after adding a warning. We should probably also make relevant tasks more robust to daemon downgrades (e.g. refuse to launch a workshop with format=3).
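
A downgrade guard of the kind suggested above might look roughly like the following. It is a hypothetical sketch in Python rather than the project's Go code; the names `SUPPORTED_FORMATS`, `UnsupportedFormatError` and `check_snapshot_format` are invented here, and the supported set simply assumes revisions 1 and 2.

```python
# Assumed: the daemon understands the legacy revision and the one added by this PR.
SUPPORTED_FORMATS = frozenset({1, 2})


class UnsupportedFormatError(Exception):
    """Raised when a workshop was written by a newer daemon than the one running."""


def check_snapshot_format(revision, supported=SUPPORTED_FORMATS):
    """Return the revision if this daemon can handle it, otherwise refuse."""
    if revision not in supported:
        raise UnsupportedFormatError(
            f"snapshot format {revision} is not supported "
            f"(supported: {sorted(supported)})"
        )
    return revision
```

Calling `check_snapshot_format(3)` with the assumed set raises, which corresponds to refusing to launch a workshop with format=3 after a daemon downgrade.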

Futureproofing

This is intended to be a temporary workaround, but it might take some time to fix properly. So here's a script that generates the cloud-config. It's AI-generated but I think it's pretty solid:

import subprocess
import sys

GROUPS = ["adm", "cdrom", "sudo", "dip", "plugdev", "audio", "netdev", "lxd", "video", "render"]
VERSIONS = ["20.04", "22.04", "24.04", "26.04"]

# Groups related to host resources, peripheral devices, virtual interfaces, or tunnels passed to containers:
# - render/video/audio (GPUs/media)
# - cdrom (optical drives)
# - plugdev (hotplugged USB/HID devices)
# - netdev/dip (virtual network interfaces, tunnels like TUN/TAP, or dialup/cellular modems)
# Purely administrative or system auth gates (adm, sudo, lxd) are excluded, as GID mismatches do not influence device/resource permissions.
EXCLUDED_COMPAT_GROUPS = ["adm", "sudo", "lxd"]

def get_group_gid(container, group):
    try:
        # Run getent group <group> inside the container
        res = subprocess.run(
            ["lxc", "exec", container, "--", "getent", "group", group],
            capture_output=True,
            text=True,
            check=True
        )
        # Parse output e.g. "adm:x:4:syslog,ubuntu" -> GID is the 3rd field
        parts = res.stdout.strip().split(":")
        if len(parts) >= 3:
            return int(parts[2])
    except subprocess.CalledProcessError:
        pass
    return None

def main():
    print("Launching temporary LXD containers to inspect group mappings...", file=sys.stderr)
    
    # Launch one ephemeral container per release (sequentially; each call blocks until lxc returns)
    for v in VERSIONS:
        name = f"inspect-u{v.replace('.', '')}"
        subprocess.run(["lxc", "launch", f"ubuntu:{v}", name, "--ephemeral"], check=True, capture_output=True)

    print("Retrieving GID mappings from containers...", file=sys.stderr)
    
    # Gather GIDs: {group: {version: gid}}
    mappings = {g: {} for g in GROUPS}
    for v in VERSIONS:
        name = f"inspect-u{v.replace('.', '')}"
        for g in GROUPS:
            gid = get_group_gid(name, g)
            mappings[g][v] = gid

    # Clean up containers
    print("Cleaning up temporary containers...", file=sys.stderr)
    for v in VERSIONS:
        name = f"inspect-u{v.replace('.', '')}"
        subprocess.run(["lxc", "delete", "-f", name], check=True, capture_output=True)

    # Figure out which non-excluded groups have varying GIDs across releases
    # and map GID back to the versions they belong to: {group: {gid: [versions]}}
    group_gid_versions = {}
    for g in GROUPS:
        if g in EXCLUDED_COMPAT_GROUPS:
            continue
            
        g_map = {}
        for v in VERSIONS:
            gid = mappings[g][v]
            if gid is not None:
                g_map.setdefault(gid, []).append(v)
        
        # Keep only groups whose GID differs between releases
        if len(g_map) > 1:
            group_gid_versions[g] = g_map

    # 1. Generate multi-line bootcmd script
    bootcmd_lines = [
        "set -e",
        "maybe_groupadd() {",
        "    # Ignore GID not unique (exit code 4) or group name not unique (exit code 9)",
        "    groupadd -g \"$1\" -r \"$2\" || case $? in 4|9) ;; *) return $? ;; esac",
        "}",
        "maybe_groupadd 1000 workshop"
    ]
    
    for group in sorted(group_gid_versions.keys()):
        for gid in sorted(group_gid_versions[group].keys()):
            compat_name = f"{group}-compat-{gid}"
            bootcmd_lines.append(f"maybe_groupadd {gid} {compat_name}")

    # 2. Generate user group membership list with comments
    yaml_lines = [
        "#cloud-config",
        "bootcmd:",
        "- |"
    ]
    for line in bootcmd_lines:
        yaml_lines.append(f"  {line}")
        
    yaml_lines.append("users:")
    yaml_lines.append("- default")
    yaml_lines.append("- name: workshop")
    yaml_lines.append("  uid: 1000")
    yaml_lines.append("  primary_group: workshop")
    yaml_lines.append("  sudo: ALL=(ALL) NOPASSWD:ALL")
    yaml_lines.append("  create_groups: false")
    yaml_lines.append("  groups:")
    yaml_lines.append("  # Standard system groups")
    for group in GROUPS:
        yaml_lines.append(f"  - '{group}'")
        
    yaml_lines.append("  # Compatibility GIDs for various host systems:")
    for group in sorted(group_gid_versions.keys()):
        for gid in sorted(group_gid_versions[group].keys()):
            versions_str = ", ".join(group_gid_versions[group][gid])
            yaml_lines.append(f"  - '{gid}' # {group} on {versions_str}")
        
    yaml_lines.append("  shell: /bin/bash")

    # Print markdown table
    print("# Group to GID Mappings Table\n")
    headers = ["Group"] + VERSIONS
    print("| " + " | ".join(headers) + " |")
    print("|" + "|".join(["---" for _ in range(len(headers))]) + "|")
    for g in GROUPS:
        row = [g] + [str(mappings[g][v] if mappings[g][v] is not None else "N/A") for v in VERSIONS]
        print("| " + " | ".join(row) + " |")
    print("\n")

    print("# Generated cloud-init config\n")
    print("```yaml")
    print("\n".join(yaml_lines))
    print("```")

if __name__ == "__main__":
    main()
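
The grouping step in the script can be tried without LXD by feeding it hand-written data. The sketch below restates that step as a standalone function; the function name `varying_gids` and the sample GIDs are made up for illustration.

```python
def varying_gids(mappings, excluded=()):
    """Return {group: {gid: [versions]}} for groups whose GID differs between releases."""
    result = {}
    for group, by_version in mappings.items():
        if group in excluded:
            continue
        by_gid = {}
        for version, gid in by_version.items():
            if gid is not None:  # the group may not exist on every release
                by_gid.setdefault(gid, []).append(version)
        if len(by_gid) > 1:
            result[group] = by_gid
    return result


# Invented sample data, not real Ubuntu GIDs.
sample = {
    "video": {"24.04": 44, "26.04": 44},
    "render": {"24.04": 990, "26.04": 991},
    "lxd": {"24.04": 101, "26.04": 102},
}
print(varying_gids(sample, excluded=["lxd"]))
# prints {'render': {990: ['24.04'], 991: ['26.04']}}
```

Only render survives: video has the same GID everywhere, and lxd is excluded even though its GID varies.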

Self-review quick check

  • Make decisions that cost a lot to reverse explicit in the PR description.
  • Avoid nested conditions.
  • Delete dead code and redundant comments.
  • Normalise symmetries by sticking to doing identical things identically.
// one way to handle errors
if err := f(); err != nil {
   ...
}

// one way to handle multiple returns
val, err := f()
if err != nil {
   ...
}
...
  • Check that coupled code elements, files, and directories are adjacent. For example, test data is stored as close as possible to a test.
  • Put variable declaration and initialisation together.
  • Divide large expressions into digestible and self-explanatory ones. Use multiple variables if required.
  • Put a blank line between two logically different chunks of code.
  • Follow the style guide for new error messages.

Docs

Procedure:

  • I have checked and added or updated relevant documentation.
  • I have checked and added or updated relevant release notes.
  • I have included the technical author in the review.

Content:

  • Headings and titles accurately describe the content.
  • New and updated pages include correct metadata.
  • Documentation tests are added or updated where applicable (for tutorial/ and how-to/ sections).
  • Documentation follows the style guide.
  • If needed, docs/.coverage.yaml updated, coverage tags added (.. artefact).

Or:

  • I confirm the PR has no implications for documentation.

@jonathan-conder self-assigned this Jun 8, 2026
@jonathan-conder force-pushed the fix/render-gids branch 2 times, most recently from f83065b to bcfbdd6 on June 8, 2026 08:20
@jonathan-conder marked this pull request as ready for review June 9, 2026 01:07
@jonathan-conder requested a review from Copilot June 9, 2026 01:08

Copilot AI left a comment

Pull request overview

This PR updates Workshop’s LXD backend cloud-init user-data to handle group-GID mismatches across Ubuntu bases/hosts (notably render and netdev) by adding the workshop user to multiple compatibility GIDs, and bumps the snapshot format revision accordingly so existing workshops can be refreshed consistently.

Changes:

  • Extend the generated cloud-init config to include create_groups: false, a YAML groups: list including compatibility GIDs, and a bootcmd that creates the required compat groups.
  • Bump the snapshot format revision to force rebuild/refresh behavior for the new on-disk format.
  • Update the integration snapshot-format test expectations to the new user.user-data hash.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| internal/workshop/lxd/lxd_backend.go | Reworks cloud-init group membership injection to include compat GIDs and group creation via bootcmd. |
| internal/workshop/lxd/lxd_backend_snapshots.go | Bumps snapshot format revision from 1 → 2. |
| internal/workshop/lxd/tests/integration/snapshot-format.yaml | Updates expected snapshot config hashes to match the new cloud-init user-data. |

Comment thread internal/workshop/lxd/lxd_backend.go