Drop gitlab dynamic pipelines [HMS-9712] by achilleas-k · Pull Request #2359 · osbuild/image-builder

achilleas-k · 2026-05-21T15:11:59Z

This PR simplifies image building and testing in Gitlab CI by removing the dynamic pipeline generation and instead builds and tests all images for a given distribution and architecture on the same runner.

The imgtestlib has been refactored into a module with multiple files for easier navigation, as it was getting too big for a single file.

Some further improvements I'd like to make after this is merged:

Async "touch" for S3 objects.
Async boot tests.
Return errors from build and boot functions. Currently the test functions rely on the sp.run() shell commands failing to fail a build. I'd like to capture those errors instead and handle them gracefully. That way we can generate clean failure messages. Also it would make it possible to continue with other image builds when a build or boot test fails.
Merge vmtest into imgtestlib.
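
The error-handling item above could look roughly like the following sketch. The names `StepResult`, `run_step`, `run_all`, and `summarize` are hypothetical and not part of imgtestlib; the idea is only that each command's failure is captured in a result object instead of aborting the job, so the remaining images still get built and a clean summary can be printed at the end.

```python
import subprocess as sp
from dataclasses import dataclass


@dataclass
class StepResult:
    """Outcome of one build or boot step (hypothetical type, for illustration)."""
    name: str
    ok: bool
    detail: str = ""


def run_step(name, cmd):
    """Run cmd and return a StepResult instead of raising on failure."""
    try:
        sp.run(cmd, check=True, stdout=sp.PIPE, stderr=sp.STDOUT, text=True)
    except sp.CalledProcessError as err:
        # Keep only the tail of the output for a short failure message.
        tail = "\n".join((err.stdout or "").splitlines()[-5:])
        return StepResult(name, False, f"exit code {err.returncode}\n{tail}".strip())
    except OSError as err:
        # For example, the executable does not exist.
        return StepResult(name, False, str(err))
    return StepResult(name, True)


def run_all(steps):
    """Run every (name, cmd) pair, even after a failure, and return all results."""
    return [run_step(name, cmd) for name, cmd in steps]


def summarize(results):
    """Return a short report with one line per failed step."""
    failed = [r for r in results if not r.ok]
    lines = [f"{len(results) - len(failed)}/{len(results)} steps succeeded"]
    for r in failed:
        first_line = r.detail.splitlines()[0] if r.detail else "unknown error"
        lines.append(f"FAILED {r.name}: {first_line}")
    return "\n".join(lines)
```

The job would then exit non-zero only after all images have been attempted, based on whether any result has `ok` set to `False`.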

Closes #1703

achilleas-k · 2026-05-21T15:32:26Z

The PR moves the core parts of the test scripts into the imgtestlib module. The boot-image script uses Python's match statement, which isn't available on EL9. This wasn't an issue before because boot-image was only ever run on the CI runners, which are Fedora 42. Now that the core functionality is part of the importable module though, we need to rewrite it to run on older Python versions.

We should be testing builds on EL9 as well, so I should do this regardless.
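
For illustration, a `match` statement (Python 3.10+) can usually be rewritten as a dictionary lookup or an `if`/`elif` chain that also runs on the Python 3.9 shipped with EL9. The image types and the `boot_method()` helper below are hypothetical and only show the shape of the rewrite, not the actual boot-image logic.

```python
# Python 3.10+ only:
#
#     match image_type:
#         case "ami" | "ec2":
#             return "aws"
#         case "qcow2":
#             return "qemu"
#         case _:
#             return None

# Equivalent form that also works on Python 3.9.
BOOT_METHODS = {
    "ami": "aws",
    "ec2": "aws",
    "qcow2": "qemu",
}


def boot_method(image_type):
    """Return the boot method for an image type, or None if it cannot be booted."""
    return BOOT_METHODS.get(image_type)
```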

supakeen · 2026-05-22T05:09:08Z

So; since there are a lot of failures I went through them:

7 jobs succeeded.
20+ jobs got their instance killed.
Jobs fail when testing installers, as they need access to KVM and it isn't available.
A few failures due to: time="2026-05-21T16:30:35Z" level=fatal msg="Error parsing image name \"docker://None\": invalid reference format: repository name must be lowercase"

achilleas-k · 2026-05-24T18:07:00Z

> So; since there are a lot of failures I went through them:

Thanks for going through them!!

> 1. 7 jobs succeeded.

Not great.

> 2. 20+ jobs got their instance killed.

I suspect this will be the biggest issue with this change.

> 3. Jobs fail when testing installers, as they need access to KVM and it isn't available.

Ugh, right, yeah. I guess we're going to need to run everything on KVM-enabled runners since every distro has an installer.

> 4. A few failures due to: `time="2026-05-21T16:30:35Z" level=fatal msg="Error parsing image name \"docker://None\": invalid reference format: repository name must be lowercase"`

I think I fixed that? Anyway, definitely fixable.

lzap · 2026-05-25T10:51:58Z

Observation: average job time was 1 hour and the slowest one was 4 hours.

> 2. 20+ jobs got their instance killed.

We must start tracking these. I wonder if we actually pay more than we would without spot instances, because when AWS kills a spot instance for capacity reasons, we still pay for the time on the clock. AWS sends a signal 2 minutes before the termination, so we can mark those jobs for later inspection and statistics.
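
A minimal sketch of how a runner could detect that notice, assuming the EC2 instance metadata service (IMDSv2) is reachable from the job. The `spot/instance-action` metadata path returns 404 until an interruption is scheduled. The function names and the injectable `fetch` parameter are hypothetical, and how a job would then be marked for later inspection is left open.

```python
import json
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def imds_fetch(path, timeout=1.0):
    """Fetch a metadata path using an IMDSv2 session token. Returns None on 404."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        f"{IMDS}/{path}", headers={"X-aws-ec2-metadata-token": token}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # No interruption has been scheduled for this instance.
            return None
        raise


def spot_interruption(fetch=imds_fetch):
    """Return the pending interruption as a dict, or None if none is scheduled."""
    body = fetch("meta-data/spot/instance-action")
    if body is None:
        return None
    return json.loads(body)
```

Polling `spot_interruption()` every few seconds from a background thread would be enough to record which jobs were cut short, since the notice arrives about two minutes ahead of the termination.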

achilleas-k · 2026-05-27T18:13:42Z

Rebased on main but deleted .gitlab-ci.yml. I want to try a few things before rerunning the pipelines. Setting to draft.

achilleas-k · 2026-05-28T20:21:42Z

I'll rebase this on #2383 and start experimenting with doing some things async

achilleas-k · 2026-06-10T10:21:44Z

The current state captures all output from the build and saves it as a job artifact. The boot tests also generate a lot of output though, so we're still reaching the limits of the job log viewer.

Doing boot tests async and saving the output as an artifact as well will help. I'm not sure if doing multiple local (KVM) boot tests in parallel is a good idea though. Our runners would probably get overloaded quickly if we start boot testing an ISO and a couple of qcows at the same time.
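
One way to address the overload concern would be to cap how many local boot tests run at once. A rough sketch with `asyncio`, where `run_limited()` and `fake_boot_test()` are hypothetical names and the real boot tests are replaced by a sleep:

```python
import asyncio


async def run_limited(jobs, max_parallel):
    """Run coroutine functions concurrently, at most max_parallel at a time.

    Results are returned in the same order as jobs.
    """
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(job):
        # Wait for a free slot before starting the job.
        async with limit:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_boot_test(name, delay=0.01):
    """Stand-in for a real boot test (hypothetical); sleeps instead of booting."""
    await asyncio.sleep(delay)
    return f"{name}: booted"
```

With `max_parallel=1` the local (KVM) tests would still run strictly one after another, while a separate, larger limit could be used for cloud boot tests that do not load the runner.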

Builds all modified images for a specific distro and (host) architecture. This script is essentially the same as the generate-build-config script, only instead of generating a gitlab-ci file with the images that need to be rebuilt, it runs any required builds in sequence. On successful build, it boots the image (if supported, decided by the boot_image() function) and uploads the results.

Builds all modified images that depend on an ostree commit for a specific distro and (host) architecture. The script is essentially the same as the generate-ostree-build-config script, only instead of generating a gitlab-ci file with the images that need to be rebuilt, it runs any required builds in sequence. This is very similar to the test-new-manifests script, but it also handles discovering, downloading, and running ostree containers to serve the payload ostree commits for derived images (ostree disk images and installers).

Now that we're building all images on the same runner, the log becomes too long and noisy and hits the limits of the CI log length. Capture build log output and errors and store them in a path that will be used as a CI artifact we can review. Note that runcmd() will print stdout and stderr when a command fails.
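
A minimal sketch of this behaviour, not the actual `runcmd()` from imgtestlib: output is appended to a log file under a directory that the job would publish as an artifact, and it is echoed to the job log only when the command fails. The function name, the `log_path` parameter, and the file layout are assumptions.

```python
import subprocess as sp
import sys
from pathlib import Path


def runcmd_logged(cmd, log_path):
    """Run cmd, append its combined output to log_path, and echo it on failure."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    proc = sp.run(cmd, stdout=sp.PIPE, stderr=sp.STDOUT, text=True, check=False)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"$ {' '.join(cmd)}\n{proc.stdout}")
    if proc.returncode != 0:
        # Only failures reach the size-limited job log.
        sys.stderr.write(proc.stdout)
        raise sp.CalledProcessError(proc.returncode, cmd, output=proc.stdout)
    return proc.stdout
```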

The log_section's __init__() is called once per instance of the decorator itself, so multiple calls to a decorated method (e.g. build()) use the same ID. Generate the ID when entering the context instead so that multiple invocations of the same function use different IDs.
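
The fix can be illustrated with a small `contextlib.ContextDecorator` subclass. This is a sketch rather than the `log_section` in imgtestlib: the class body, the ID scheme, and the use of GitLab's collapsible section markers are assumptions. The point is that `__init__()` runs once when the function is decorated, while `__enter__()` runs on every call.

```python
import contextlib
import time
import uuid


class log_section(contextlib.ContextDecorator):
    """Wrap a block or function in a collapsible GitLab job log section."""

    def __init__(self, title):
        # Runs once per decorator instance, so no section ID is created here.
        self.title = title
        self._ids = []

    def __enter__(self):
        # Runs on every entry: each invocation gets its own section ID.
        section_id = f"section_{uuid.uuid4().hex[:8]}"
        self._ids.append(section_id)
        now = int(time.time())
        print(f"\033[0Ksection_start:{now}:{section_id}[collapsed=true]\r\033[0K{self.title}")
        return self

    def __exit__(self, *exc):
        # Close the most recently opened section; never swallow exceptions.
        section_id = self._ids.pop()
        print(f"\033[0Ksection_end:{int(time.time())}:{section_id}\r\033[0K")
        return False
```

Keeping the IDs on a stack means nested or recursive calls through the same decorator instance still close the section they opened.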

achilleas-k requested review from a team and thozza as code owners May 21, 2026 15:12

achilleas-k requested review from lzap and supakeen May 21, 2026 15:12

achilleas-k force-pushed the ci/no-dynamic-pipelines branch 2 times, most recently from f000c79 to 3781ada May 21, 2026 16:08

achilleas-k force-pushed the ci/no-dynamic-pipelines branch from 3781ada to 57fe49f May 27, 2026 18:12

achilleas-k marked this pull request as draft May 27, 2026 18:13

achilleas-k mentioned this pull request May 28, 2026

Break down imgtestlib module and use log sections #2383

Merged

achilleas-k force-pushed the ci/no-dynamic-pipelines branch 6 times, most recently from 7e7e90b to a1edcb7 June 10, 2026 08:23

achilleas-k added 8 commits June 10, 2026 12:46

test: update gitlab-ci.yml generator

e794722

Update the gitlab-ci.yml generator to run the new tests. Generate the new config.

test: delete the dynamic pipeline generators

539f627

Schutzfile: bump the rngseed

1694100

Let's test everything!

test: move prints outside of build_image()

5ae136b

This way, the build progress should look like this:

    1/22: Testing image ...
    <folded> Image build log
    <folded> Boot test log
    <folded> Results upload log
    Test finished!!

gitlab: save build logs to job artifacts

bc95696

achilleas-k added 2 commits June 10, 2026 12:47

WIP: disable local VM boot tests

1e1f557

These boot tests generate a lot of output and take too long. I want to see what a full run looks like with only cloud boot tests.

achilleas-k force-pushed the ci/no-dynamic-pipelines branch from 21d6744 to 1e1f557 June 10, 2026 10:48

achilleas-k changed the title from "Drop gitlab dynamic pipelines and refactor imgtestlib [HMS-9712]" to "Drop gitlab dynamic pipelines [HMS-9712]" Jun 10, 2026

achilleas-k wants to merge 10 commits into osbuild:main from achilleas-k:ci/no-dynamic-pipelines
