Drop gitlab dynamic pipelines [HMS-9712] #2359
The PR moves the core parts of the test scripts into the imgtestlib module. We should be testing builds on EL9 as well, so I should do this regardless.
f000c79 to 3781ada
So, since there are a lot of failures, I went through them:
Thanks for going through them!!
Not great.
I suspect this will be the biggest issue with this change.
Ugh, right, yeah. I guess we're going to need to run everything on KVM-enabled runners since every distro has an installer.
I think I fixed that? Anyway, definitely fixable.
Observation: average job time was 1 hour and the slowest one was 4 hours.
We must start tracking these. I wonder if we actually pay more than we would without spot instances, because when a spot instance is killed by AWS for capacity reasons, we still pay for the time on the clock. AWS sends a signal 2 minutes before the termination, so we can mark those jobs for later inspection and statistics.
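A minimal sketch of how a job could record that 2-minute notice, assuming the runner can reach the EC2 instance metadata service. The metadata path and the JSON shape of the notice follow the AWS documentation; `fetch_notice()`, `interruption_record()`, and the record layout are hypothetical names for illustration, not part of this PR.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

# EC2 instance metadata endpoints (IMDSv2). The spot path returns 404
# until AWS has scheduled an interruption for this instance.
TOKEN_URL = "http://169.254.169.254/latest/api/token"
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def fetch_notice(timeout=1.0):
    """Return the raw notice body, or None if no interruption is scheduled."""
    token_req = urllib.request.Request(
        TOKEN_URL,
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        SPOT_ACTION_URL, headers={"X-aws-ec2-metadata-token": token}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def parse_instance_action(body):
    """Parse a notice such as {"action": "terminate", "time": "2017-09-18T08:22:00Z"}."""
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return notice["action"], when.replace(tzinfo=timezone.utc)


def interruption_record(job_id, body):
    """Build a small record (hypothetical layout) a job could save for statistics."""
    action, when = parse_instance_action(body)
    return {"job": job_id, "action": action, "scheduled_for": when.isoformat()}
```

A job would have to poll `fetch_notice()` in the background every few seconds and write the record to an artifact path when it returns something other than `None`.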
3781ada to 57fe49f
Rebased on
I'll rebase this on #2383 and start experimenting with doing some things async.
7e7e90b to a1edcb7
Current state captures all output from the build and saves it as a job artifact. The boot tests also generate a lot of output though, so we're still reaching the limits of the job log viewer. Doing boot tests async and saving the output as an artifact as well will help. I'm not sure if doing multiple local (KVM) boot tests in parallel is a good idea though. Our runners would probably get overloaded quickly if we start boot testing an ISO and a couple of qcows at the same time.
Builds all modified images for a specific distro and (host) architecture. This script is essentially the same as the generate-build-config script, only instead of generating a gitlab-ci file with the images that need to be rebuilt, it runs any required builds in sequence. On successful build, it boots the image (if supported, decided by the boot_image() function) and uploads the results.
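The sequential flow described in this commit message can be sketched roughly as follows. This is not the script's actual code: `run_image_tests()` and the `build` and `upload` callables are hypothetical stand-ins, and the signature assumed for `boot_image()` is a guess.

```python
def run_image_tests(images, build, boot_image, upload):
    """Build each image in sequence, boot it, then upload the results.

    All callables are injected so the sketch stays self-contained:
      build(image)          -> returns the build output directory (assumed)
      boot_image(build_dir) -> boots the image if its type is supported (assumed)
      upload(build_dir)     -> uploads the build results (assumed)
    """
    total = len(images)
    for idx, image in enumerate(images, start=1):
        print(f"{idx}/{total}: Testing image {image}")
        build_dir = build(image)
        # Per the commit message, boot_image() itself decides whether
        # the image type supports a boot test.
        boot_image(build_dir)
        upload(build_dir)
        print("Test finished!!")
```

In this sketch, any exception raised by a step stops the whole run, so a failed build is never booted or uploaded.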
Builds all modified images that depend on an ostree commit for a specific distro and (host) architecture. The script is essentially the same as the generate-ostree-build-config script, only instead of generating a gitlab-ci file with the images that need to be rebuilt, it runs any required builds in sequence. This is very similar to the test-new-manifests script, but it also handles discovering, downloading, and running ostree containers to serve the payload ostree commits for derived images (ostree disk images and installers).
Update the gitlab-ci.yml generator to run the new tests. Generate the new config.
Let's test everything!
This way, the build progress should look like this:
    1/22: Testing image ...
    <folded> Image build log
    <folded> Boot test log
    <folded> Results upload log
    Test finished!!
Now that we're building all images on the same runner, the log becomes too long and noisy and hits the limits of the CI log length. Capture build log output and errors and store them in a path that will be used as a CI artifact we can review. Note that runcmd() will print stdout and stderr when a command fails.
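A minimal sketch of the capture-to-artifact idea, assuming plain `subprocess`. The helper name `runcmd_logged()` and its parameters are made up for illustration and are not the real `runcmd()` from imgtestlib.

```python
import os
import subprocess as sp
import sys


def runcmd_logged(cmd, log_path):
    """Run cmd, write combined stdout/stderr to log_path, echo it only on failure."""
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    # Merge stderr into stdout so the log file keeps the original ordering.
    proc = sp.run(cmd, stdout=sp.PIPE, stderr=sp.STDOUT, text=True, check=False)
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(proc.stdout)
    if proc.returncode != 0:
        # Keep the job log quiet on success; surface the output only on failure.
        print(proc.stdout, file=sys.stderr)
        raise sp.CalledProcessError(proc.returncode, cmd, output=proc.stdout)
    return proc.stdout
```

If `log_path` sits under a directory listed in the job's `artifacts:paths`, the full output stays reviewable without counting against the job log limit.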
The log_section's __init__() is called once per instance of the decorator itself, so multiple calls to a decorated method (e.g. build()) use the same ID. Generate the ID when entering the context instead so that multiple invocations of the same function use different IDs.
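To illustrate the fix, here is a small sketch of a decorator-capable context manager that generates its ID in `__enter__()`. The class body is an assumption and not the actual imgtestlib code; only the section marker format is taken from GitLab's documentation on collapsible job log sections.

```python
import time
import uuid
from contextlib import ContextDecorator


class log_section(ContextDecorator):
    """Wrap printed output in GitLab collapsible-section markers."""

    def __init__(self, header):
        # Runs once, when the decorator is applied: keep only static data here.
        self.header = header
        self._ids = []  # stack, so nested or recursive calls pair up correctly

    def __enter__(self):
        # Runs on every call of the decorated function: fresh ID each time.
        section_id = f"section_{uuid.uuid4().hex}"
        self._ids.append(section_id)
        print(
            f"\x1b[0Ksection_start:{int(time.time())}:{section_id}"
            f"[collapsed=true]\r\x1b[0K{self.header}"
        )
        return self

    def __exit__(self, *exc):
        section_id = self._ids.pop()
        print(f"\x1b[0Ksection_end:{int(time.time())}:{section_id}\r\x1b[0K")
        return False  # never swallow exceptions from the wrapped code
```

With `@log_section("Image build log")` applied to a function, `ContextDecorator` reuses the same instance for every call, which is why an ID created in `__init__()` would be shared between calls.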
These boot tests generate a lot of output and take too long. I want to see what a full run looks like with only cloud boot tests.
21d6744 to 1e1f557
This PR simplifies image building and testing in Gitlab CI by removing the dynamic pipeline generation and instead building and testing all images for a given distribution and architecture on the same runner.
The imgtestlib has been refactored into a module with multiple files for easier navigation, as it was getting too big for a single file.
Some further improvements I'd like to do after this is merged:
sp.run() shell commands failing to fail a build. I'd like to capture those errors instead and handle them gracefully. That way we can generate clean failure messages. Also, it would make it possible to continue with other image builds when a build or boot test fails.
Closes #1703
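One possible shape for the error-capturing follow-up, sketched under the assumption that each build boils down to a single command. `run_all()` and its return format are hypothetical and not part of this PR.

```python
import subprocess as sp


def run_all(builds):
    """Run each (name, cmd) pair and collect failures instead of aborting early.

    Returns a dict mapping the name of each failed build to a short message.
    """
    failures = {}
    for name, cmd in builds:
        try:
            sp.run(cmd, check=True, stdout=sp.PIPE, stderr=sp.STDOUT, text=True)
        except sp.CalledProcessError as err:
            # Keep only the last output line for a clean summary message.
            lines = (err.stdout or "").strip().splitlines()
            last = lines[-1] if lines else "no output"
            failures[name] = f"exit code {err.returncode}: {last}"
        except OSError as err:
            # The command could not be started at all (e.g. missing binary).
            failures[name] = f"could not start: {err}"
    return failures
```

The caller could print the collected messages at the end and exit non-zero if the dict is not empty, so the job still fails but every image gets its turn.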