
[ML] Automate minor version bump in CI pipeline #3064

Open
edsavage wants to merge 14 commits into elastic:main from edsavage:feature/minor-version-bump


edsavage commented Jul 2, 2026

Contributor

Supersedes #3063 (same commits; reopened from personal fork per CONTRIBUTING.md).

Summary

Extends the ml-cpp-version-bump Buildkite pipeline with WORKFLOW=minor for feature-freeze (minor version) releases, complementing the existing patch workflow (#3030).

Phase 1 (validate): checks that main is at NEW_VERSION (e.g. 9.5.0), derives MAIN_NEW_VERSION (e.g. 9.6.0), and sets meta-data for which legs still need work.
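A minimal sketch of the Phase 1 check and derivation, for illustration only. The function names `derive_main_new_version` and `validate_minor_freeze` are hypothetical, and the rule that NEW_VERSION must end in `.0` is an assumption; the real logic lives in dev-tools/version_bump_validation.py and may differ.

```python
import re

# MAJOR.MINOR.PATCH with purely numeric components.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def derive_main_new_version(new_version: str) -> str:
    """Return the next minor version, e.g. 9.5.0 -> 9.6.0."""
    match = _VERSION_RE.match(new_version)
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {new_version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    # Assumption: a minor freeze always starts from a .0 patch level.
    if patch != 0:
        raise ValueError(f"minor freeze expects a .0 version, got {new_version!r}")
    return f"{major}.{minor + 1}.0"


def validate_minor_freeze(main_version: str, new_version: str) -> str:
    """Check that main is at NEW_VERSION and return MAIN_NEW_VERSION."""
    if main_version != new_version:
        raise ValueError(
            f"main is at {main_version}, expected NEW_VERSION={new_version}"
        )
    return derive_main_new_version(new_version)


if __name__ == "__main__":
    print(validate_minor_freeze("9.5.0", "9.5.0"))
```

With main at 9.5.0 and NEW_VERSION=9.5.0, the sketch returns 9.6.0, matching the example in the description.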

Phase 2 (minor freeze):

  • Leg A: create release branch BRANCH from main (direct ref push)
  • Leg B: bump main to MAIN_NEW_VERSION, update .backportrc.json, open PR
  • Slack notification (step notify)
  • DRA artifact wait (production branches only)

Also includes testing-MAJOR.MINOR sandbox branch support for manual Buildkite testing only — version rules strip the testing- prefix; git operations use the full ref name; main bump and DRA wait are skipped so sandbox runs do not touch production branches.
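The sandbox rule can be sketched as follows. This is an illustrative sketch, not the PR's code: `BranchIdentity`, `parse_branch`, and `branch_matches_version` are hypothetical names, and the strict MAJOR.MINOR pattern is an assumption.

```python
import re
from typing import NamedTuple

SANDBOX_PREFIX = "testing-"
_BRANCH_RE = re.compile(r"^\d+\.\d+$")


class BranchIdentity(NamedTuple):
    git_ref: str          # full name, used for git operations
    version_branch: str   # MAJOR.MINOR, used for version rules
    is_sandbox: bool      # True for testing-MAJOR.MINOR branches


def parse_branch(branch: str) -> BranchIdentity:
    """Split BRANCH into the ref for git and the MAJOR.MINOR for version rules."""
    is_sandbox = branch.startswith(SANDBOX_PREFIX)
    version_branch = branch[len(SANDBOX_PREFIX):] if is_sandbox else branch
    if not _BRANCH_RE.match(version_branch):
        raise ValueError(f"expected MAJOR.MINOR or testing-MAJOR.MINOR, got {branch!r}")
    return BranchIdentity(branch, version_branch, is_sandbox)


def branch_matches_version(identity: BranchIdentity, new_version: str) -> bool:
    """Version rules compare against the prefix-stripped branch name."""
    return new_version.startswith(identity.version_branch + ".")


if __name__ == "__main__":
    ident = parse_branch("testing-9.5")
    print(ident.git_ref, ident.version_branch, ident.is_sandbox)
```

Keeping both names in one value makes it harder to pass the stripped name to a git command by mistake.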

Test plan

Automated (local / PR CI)

  • ./dev-tools/run_dev_tools_tests.sh — validation rules, pipeline JSON, phase-2 upload, DRA gating (48 tests at last run)
  • ml-cpp-pr-builds green on this PR

Buildkite — production parameters (validate only)

Recommended before merge; does not create branches or open PRs when DRY_RUN=true:

WORKFLOW=minor
BRANCH=9.5
NEW_VERSION=9.5.0
DRY_RUN=true

Branch: edsavage:feature/minor-version-bump

Buildkite — sandbox E2E (orchestration)

Proven on build #54 (passed):

WORKFLOW=minor
BRANCH=testing-9.5
NEW_VERSION=9.5.0
VERSION_BUMP_NO_MERGE=true

Exercises: validate → create testing-9.5 @ 9.5.0 → skip main bump → Slack notify → DRA skip.

The testing-* branch support is intentional: it exists for pipeline regression testing and is inert when release-eng uses the production BRANCH=9.5.

Buildkite — idempotency (sandbox)

With testing-9.5 already present after a successful run:

WORKFLOW=minor
BRANCH=testing-9.5
NEW_VERSION=9.5.0
DRY_RUN=true

Expect: validate passes, ml_cpp_version_bump_noop=true, phase 2 not scheduled.
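One way to picture the idempotency expectation is as a small planning function that decides which legs still need work. This is a sketch under stated assumptions: `MinorFreezePlan` and `plan_minor_freeze` are hypothetical names, and the exact conditions the real validation uses to set ml_cpp_version_bump_noop are not shown in this PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinorFreezePlan:
    create_branch: bool  # Leg A still needed
    bump_main: bool      # Leg B still needed
    wait_for_dra: bool   # production branches only

    @property
    def noop(self) -> bool:
        """True when neither leg has work left, so phase 2 can be skipped."""
        return not (self.create_branch or self.bump_main)


def plan_minor_freeze(
    *,
    release_branch_exists: bool,
    main_version: str,
    main_new_version: str,
    is_sandbox: bool,
) -> MinorFreezePlan:
    create_branch = not release_branch_exists
    # Sandbox runs never touch main; production runs bump only if main lags.
    bump_main = (not is_sandbox) and main_version != main_new_version
    # Assumption: the DRA wait only matters when a production leg actually ran.
    wait_for_dra = (not is_sandbox) and (create_branch or bump_main)
    return MinorFreezePlan(create_branch, bump_main, wait_for_dra)


if __name__ == "__main__":
    rerun = plan_minor_freeze(
        release_branch_exists=True,
        main_version="9.5.0",
        main_new_version="9.6.0",
        is_sandbox=True,
    )
    print(rerun.noop)
```

For the sandbox rerun above, both legs are already satisfied, so the plan is a no-op.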

Buildkite — patch workflow regression

Confirm minor changes did not break patch bumps:

WORKFLOW=patch
BRANCH=<existing release branch>
NEW_VERSION=<next patch>
DRY_RUN=true

Intentionally not tested pre-merge

  • Real production freeze (BRANCH=9.5, no testing- prefix, no DRY_RUN) — Leg B main bump PR + .backportrc.json merge; to be run by release-eng when ready
  • DRA wait on production branches — long-running; sandbox correctly skips for testing-*
  • Auto-merge (VERSION_BUMP_MERGE_AUTO=true) — defer to release-eng staging

Note: DRY_RUN=true exits before phase-2 upload, so it validates parameters only and does not exercise Leg A/B scripts.

Release coordination

Release-eng parameters for a real freeze (when main is at 9.5.0):

WORKFLOW=minor
BRANCH=9.5
NEW_VERSION=9.5.0

Made with Cursor

edsavage and others added 9 commits July 2, 2026 12:52
Extend the ml-cpp-version-bump pipeline for WORKFLOW=minor (feature freeze):
create the release branch from main via direct ref push, bump main to the
derived next minor via PR (gradle.properties + .backportrc.json), then wait
for DRA artifacts on main and the new release branch.

Co-authored-by: Cursor <cursoragent@cursor.com>
Ignore GITHUB_TOKEN so a stale shell export does not override an interactive
gh login session during local version bump testing. CI continues to use
VAULT_GITHUB_TOKEN when gh is not pre-authenticated.

Co-authored-by: Cursor <cursoragent@cursor.com>
Allow BRANCH=testing-MAJOR.MINOR for manual Buildkite runs: version rules
strip the prefix while git ops use the full ref, and main bump plus DRA wait
are skipped so sandbox testing does not touch production branches.

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace exec-in-pipeline with if/else so WORKFLOW=minor only uploads the
minor follow-up steps once, avoiding duplicate Buildkite step keys.

Co-authored-by: Cursor <cursoragent@cursor.com>
Define version_bump_set_main_bump_changed before the testing-* early exit
so WORKFLOW=minor sandbox runs exit 0 instead of command-not-found.

Co-authored-by: Cursor <cursoragent@cursor.com>
Emit branch and PR lines as separate indented heredoc lines so the
uploaded notify pipeline parses when main bump is skipped on sandbox runs.

Co-authored-by: Cursor <cursoragent@cursor.com>
Buildkite executes each command array element as a separate shell line;
splitting python3 and the script path started an interactive REPL and
blocked the fetch-dra-artifacts step until timeout.

Co-authored-by: Cursor <cursoragent@cursor.com>
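The REPL fix described in the commit above can be illustrated with a small step generator. This is a sketch, not the code in .buildkite/job-version-bump-phase2.json.py: the function name `dra_wait_step`, the label text, and the example argument are hypothetical.

```python
import json
import shlex


def dra_wait_step(script_path: str, args: list[str]) -> dict:
    """Build a Buildkite command step whose command is ONE shell line.

    Emitting ["python3", script_path] as two array elements would run
    `python3` on its own line first, which starts an interactive REPL.
    """
    command_line = shlex.join(["python3", script_path, *args])
    return {
        "label": "Wait for DRA artifacts",  # hypothetical label
        "key": "fetch-dra-artifacts",
        "command": [command_line],
    }


if __name__ == "__main__":
    step = dra_wait_step(
        "dev-tools/wait_version_bump_dra.py",
        ["--example-flag"],  # hypothetical argument, for illustration only
    )
    print(json.dumps({"steps": [step]}, indent=2))
```

Using `shlex.join` also keeps arguments containing spaces or shell metacharacters quoted when they are folded into the single line.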
Nothing uploads this script since elastic#3030 moved version-bump Slack notify to
send_slack_version_bump_notification.sh in phase 2.

Co-authored-by: Cursor <cursoragent@cursor.com>
Remove duplicated helpers from bump_version.sh and validate_version_bump_params.sh so all shared functions live in one place.

Co-authored-by: Cursor <cursoragent@cursor.com>
@prodsecmachine

prodsecmachine commented Jul 2, 2026


Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- | --- |
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |


@elasticsearchmachine


Pinging @elastic/ml-core (Team:ML)

Tag bump PRs with ci:skip-es-tests and omit Java integration test pipeline
uploads when that label or version-bump topic branch names are detected.

Co-authored-by: Cursor <cursoragent@cursor.com>
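A rough sketch of the detection described in the commit above. The label ci:skip-es-tests comes from this PR; the function name `should_skip_es_tests` and the branch prefix in `HYPOTHETICAL_BUMP_BRANCH_PREFIXES` are placeholders, since the actual topic branch naming is not shown here.

```python
from typing import Iterable

SKIP_LABEL = "ci:skip-es-tests"

# Hypothetical prefix; the real topic branch names are not shown in this PR.
HYPOTHETICAL_BUMP_BRANCH_PREFIXES = ("version-bump/",)


def should_skip_es_tests(
    labels: Iterable[str],
    branch: str,
    prefixes: tuple[str, ...] = HYPOTHETICAL_BUMP_BRANCH_PREFIXES,
) -> bool:
    """Return True when a PR looks like an automated version bump."""
    if SKIP_LABEL in {label.strip() for label in labels}:
        return True
    return branch.startswith(prefixes)


if __name__ == "__main__":
    print(should_skip_es_tests(["ci:skip-es-tests", ">non-issue"], "some-branch"))
    print(should_skip_es_tests([], "feature/minor-version-bump"))
```

Checking the label first means a manually labelled PR is skipped regardless of its branch name.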
@edsavage

edsavage commented Jul 3, 2026

Contributor Author

buildkite build this

edsavage and others added 2 commits July 3, 2026 13:29
Buildkite uses author+branch/with/slashes for fork PRs; only split on the
first '+' when '/' is present. Also consult GITHUB_PR_BRANCH from the PR bot.

Co-authored-by: Cursor <cursoragent@cursor.com>
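The branch normalization in the commit above might look roughly like this. It is one reading of the commit message, not the code in .buildkite/ml_pipeline/config.py: the names `normalize_pr_branch` and `candidate_branches` are hypothetical, and treating "'/' is present" as "anywhere in the raw name" is an assumption.

```python
from typing import Mapping


def normalize_pr_branch(raw: str) -> str:
    """Strip a fork author prefix such as 'author+topic/branch'.

    Only split on the first '+' when the name also contains '/', so a
    plain branch name that happens to contain '+' is left untouched.
    """
    if "+" in raw and "/" in raw:
        _author, _sep, branch = raw.partition("+")
        return branch
    return raw


def candidate_branches(env: Mapping[str, str]) -> list[str]:
    """Collect normalized branch names from the PR bot and Buildkite variables."""
    names = []
    for key in ("GITHUB_PR_BRANCH", "BUILDKITE_BRANCH"):
        value = env.get(key, "").strip()
        if value:
            names.append(normalize_pr_branch(value))
    return names


if __name__ == "__main__":
    print(normalize_pr_branch("edsavage+feature/minor-version-bump"))
    print(candidate_branches({"BUILDKITE_BRANCH": "main"}))
```

Passing the environment as an explicit mapping, rather than reading `os.environ` inside the function, lets tests supply a clean environment instead of inheriting variables from the CI agent.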
Subprocess tests must not inherit GITHUB_PR_BRANCH/BUILDKITE_BRANCH from
the agent when asserting default ES IT scheduling behaviour.

Co-authored-by: Cursor <cursoragent@cursor.com>
Reuse ci:skip-es-tests / version-bump branch detection to omit the extra
Linux debug build and test steps on metadata-only bump PRs.

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI left a comment


Pull request overview

This PR extends the ml-cpp-version-bump Buildkite automation to support minor (feature-freeze) version bumps in addition to the existing patch workflow. It adds minor-freeze validation, orchestrates “create release branch + bump main + update backport config”, updates Slack/DRA gating, and trims PR CI for automated version-bump PRs.

Changes:

  • Add WORKFLOW=minor support: validate main/branch/version state, create the release branch, and bump main to the next minor (including .backportrc.json updates).
  • Refactor/centralize shared shell helpers into dev-tools/version_bump_lib.sh and update bump/validation scripts accordingly.
  • Update Buildkite pipeline generation and PR CI gating to skip expensive ES/Java steps for automated version-bump PRs (via label/topic-branch detection), with expanded unit test coverage.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| dev-tools/wait_version_bump_dra.py | Extend DRA polling logic for minor-freeze artifact sets; add sandbox-branch skip and shared validation import. |
| dev-tools/version_bump_validation.py | Add minor-freeze validation + main-version derivation helpers; support testing-* sandbox branch identity. |
| dev-tools/version_bump_upload_phase2.sh | Upload the correct phase-2 pipeline depending on WORKFLOW (patch vs minor). |
| dev-tools/version_bump_lib.sh | New shared bash helpers for trimming, git config, meta-data, sed-inplace, and version reads. |
| dev-tools/validate_version_bump_params.sh | Add minor-freeze validation path, sandbox handling, and meta-data for phase-2 legs. |
| dev-tools/update_backportrc.py | New script to update .backportrc.json during minor feature freeze. |
| dev-tools/unittest/test_wait_version_bump_dra.py | Update patch-wait output assertion; add sandbox/minor skip test. |
| dev-tools/unittest/test_version_bump_validation.py | Add tests for sandbox branches, minor-freeze validation, and main-bump validation/derivation. |
| dev-tools/unittest/test_version_bump_upload_phase2.py | New test coverage ensuring only one phase-2 pipeline upload per workflow. |
| dev-tools/unittest/test_ml_pipeline_config.py | New tests for PR CI gating (label/branch normalization) and pipeline JSON behavior. |
| dev-tools/unittest/test_job_version_bump_pipeline.py | Add minor phase-2 pipeline tests; align DRA command encoding with Buildkite semantics. |
| dev-tools/create_minor_branch.sh | New “Leg A” script to create the release branch from main for minor freeze. |
| dev-tools/create_github_pull_request.sh | Add --label support; adjust auth behavior to prefer gh auth login locally and Vault token in CI. |
| dev-tools/bump_version.sh | Use shared helpers; add ci:skip-es-tests label to created PRs. |
| dev-tools/bump_main_minor_freeze.sh | New “Leg B” script to bump main to next minor and update .backportrc.json via PR. |
| .buildkite/pipelines/send_version_bump_notification.sh | Remove obsolete legacy Slack notification pipeline. |
| .buildkite/pipelines/send_slack_version_bump_notification.sh | Enhance Slack message logic for minor workflow and sandbox/DRY_RUN cases. |
| .buildkite/pipelines/build_linux.json.py | Skip extra debug build/test steps when the PR is a version-bump PR. |
| .buildkite/pipeline.json.py | Omit ES/IT pipeline uploads when version-bump PR CI should be skipped. |
| .buildkite/ml_pipeline/config.py | Add label/branch-based detection for skipping version-bump PR CI; branch normalization helpers. |
| .buildkite/job-version-bump-phase2.json.py | Adjust DRA wait command formatting to a single shell line. |
| .buildkite/job-version-bump-phase2-minor.json.py | New phase-2 pipeline generator for WORKFLOW=minor (branch creation + main bump + Slack + DRA). |
Comments suppressed due to low confidence (1)

.buildkite/ml_pipeline/config.py:275

  • _apply_serverless_kv_from_comment() is no longer a Config method: it ended up indented under should_skip_version_bump_pr_ci() after the early return, so self._apply_serverless_kv_from_comment() (called in parse_comment) will raise AttributeError. This also leaves a nested function definition after a return, which is easy to miss in review.
    def _apply_serverless_kv_from_comment(self):
        """Copy whitelisted KEY=value tokens from the PR comment regex capture into os.environ."""

        env_key = "GITHUB_PR_COMMENT_VAR_SERVERLESS_KV"
        if env_key not in os.environ:
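The failure mode described in this comment can be reproduced in isolation. The sketch below uses hypothetical stand-in classes (`BrokenConfig`, `FixedConfig`) and a stand-in helper `_helper`; it is not the code from .buildkite/ml_pipeline/config.py.

```python
class BrokenConfig:
    def should_skip(self) -> bool:
        return False

        # Mis-indented: this def sits after the return, inside should_skip,
        # so it is unreachable and never becomes an attribute of the class.
        def _helper(self):
            return "ok"

    def parse_comment(self) -> str:
        return self._helper()  # raises AttributeError at call time


class FixedConfig:
    def should_skip(self) -> bool:
        return False

    # Dedented to class level, so it is a real method again.
    def _helper(self) -> str:
        return "ok"

    def parse_comment(self) -> str:
        return self._helper()


if __name__ == "__main__":
    print(hasattr(BrokenConfig, "_helper"))
    print(FixedConfig().parse_comment())
    try:
        BrokenConfig().parse_comment()
    except AttributeError as exc:
        print(f"AttributeError: {exc}")
```

Because the broken class still imports cleanly, the error only surfaces when `parse_comment` runs, which is why a test that exercises that path is useful.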


Comment thread dev-tools/create_minor_branch.sh Outdated
Comment on lines +64 to +72
if ! "$PYTHON" "$VALIDATION_PY" validate-minor-freeze \
    --main-version "$main_version" \
    --new "$NEW_VERSION" \
    --branch "$BRANCH" \
    $([[ "$release_branch_exists" == "true" ]] && echo --release-branch-exists) \
    ${release_branch_version:+--release-branch-version "$release_branch_version"}
then
    exit 1
fi
Comment thread dev-tools/bump_main_minor_freeze.sh Outdated
EOF
)"

local -a pr_cmd=(
Fix Config._apply_serverless_kv_from_comment indentation, remove invalid
top-level local in bump_main_minor_freeze.sh, and build validate-minor-freeze
args with a proper array in create_minor_branch.sh.

Co-authored-by: Cursor <cursoragent@cursor.com>
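The "proper array" idea from the commit above, shown in Python rather than bash for illustration: optional flags are appended to an explicit argument list instead of being produced by unquoted inline expansions. The flag names are taken from the snippet under review; the function name `build_validate_minor_freeze_args` is hypothetical.

```python
from typing import Optional


def build_validate_minor_freeze_args(
    main_version: str,
    new_version: str,
    branch: str,
    release_branch_exists: bool,
    release_branch_version: Optional[str] = None,
) -> list[str]:
    """Build the validate-minor-freeze argument list one element at a time."""
    args = [
        "validate-minor-freeze",
        "--main-version", main_version,
        "--new", new_version,
        "--branch", branch,
    ]
    if release_branch_exists:
        args.append("--release-branch-exists")
    if release_branch_version:
        # Flag and value are separate elements, so no word splitting can occur.
        args += ["--release-branch-version", release_branch_version]
    return args


if __name__ == "__main__":
    # The list would be passed to the validation script, e.g. via
    # subprocess.run([python, validation_script, *args], check=True).
    print(build_validate_minor_freeze_args("9.5.0", "9.5.0", "9.5", False))
```

The bash equivalent is to declare an array, append to it inside `if` blocks, and expand it with `"${args[@]}"`.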
@edsavage

edsavage commented Jul 3, 2026

Contributor Author

buildkite build this

valeriy42 self-requested a review July 3, 2026 12:20
4 participants