OCPBUGS-83862: Generate autosizing-disabled machineconfig only for master and worker nodes on OCP 4.20 #5885

Merged
openshift-merge-bot[bot] merged 3 commits into openshift:release-4.20 from aksjadha:OCPBUGS-83862 on Jun 2, 2026

Conversation

@aksjadha commented Apr 27, 2026

Contributor

Fixes: OCPBUGS-83862

- What I did

Summary of changes

- How to verify it

  • Enable autosizing on a previous 4.20 cluster, then upgrade the cluster to 4.20. Verify that the new MC is generated only for the master and worker MCPs.
  • Debug into a node and check the /etc/node-sizing-enabled.env file on custom pool nodes to confirm whether the change has been reverted.

- Description for the changelog

  • Restrict autosizing-disabled MachineConfig generation to master and worker pools only, preventing custom pool autosizing configurations set via KubeletConfig from being unintentionally overridden due to MachineConfig precedence rules.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Auto-sizing configuration now applies exclusively to default node pools (master and worker). Custom node pools are excluded from automatic sizing adjustments.
  • Tests

    • Updated test coverage to validate that auto-sizing applies only to default pools and that custom pools are properly skipped.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2026
@coderabbitai

coderabbitai Bot commented Apr 27, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 237f638a-5efc-426c-b57c-3980d57b6ddd

📥 Commits

Reviewing files that changed from the base of the PR and between c19e5dc and 1473263.

📒 Files selected for processing (3)
  • pkg/controller/kubelet-config/kubelet_config_autosizing.go
  • pkg/controller/kubelet-config/kubelet_config_autosizing_test.go
  • pkg/controller/kubelet-config/kubelet_config_controller.go

Walkthrough

The PR restricts auto-sizing MachineConfig generation to only the default master and worker pools. The implementation now explicitly iterates over these pool names instead of listing all available pools; non-default pools are skipped with logging during bootstrap. Tests are updated to validate this narrowed scope.

Changes

Restrict auto-sizing MachineConfigs to default pools

| Layer / File(s) | Summary |
| --- | --- |
| Core implementation - pool scope restriction: pkg/controller/kubelet-config/kubelet_config_autosizing.go, pkg/controller/kubelet-config/kubelet_config_controller.go | ensureAutoSizingMachineConfigs iterates only over master and worker pool names and fetches each individually. RunAutoSizingBootstrap skips non-default pools with logging. The labels import is removed. A comment in Controller.Run is updated to specify auto-sizing configs exist only for master and worker. |
| Test validation of pool restriction: pkg/controller/kubelet-config/kubelet_config_autosizing_test.go | TestEnsureAutoSizingMachineConfigs verifies exactly two MachineConfigs are created for master and worker. TestRunAutoSizingBootstrap adds a subtest that confirms non-default pools are skipped and only master/worker configs are returned. |
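
The pool restriction summarized above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: the `Pool` type, the `defaultPools` map, and the `autoSizingConfigNames` function are hypothetical stand-ins, and the generated names simply follow the `50-<pool>-auto-sizing-disabled` pattern visible in the test results later in this conversation.

```go
package main

import "fmt"

// Pool is a hypothetical stand-in for mcfgv1.MachineConfigPool;
// only the name matters for this sketch.
type Pool struct{ Name string }

// defaultPools holds the only pool names that should receive an
// auto-sizing MachineConfig (assumption based on the walkthrough).
var defaultPools = map[string]bool{"master": true, "worker": true}

// autoSizingConfigNames returns one MachineConfig name per default pool
// and skips every other pool, logging the skip.
func autoSizingConfigNames(pools []Pool) []string {
	var names []string
	for _, p := range pools {
		if !defaultPools[p.Name] {
			fmt.Printf("skipping non-default pool %q\n", p.Name)
			continue
		}
		names = append(names, fmt.Sprintf("50-%s-auto-sizing-disabled", p.Name))
	}
	return names
}

func main() {
	pools := []Pool{{Name: "master"}, {Name: "worker"}, {Name: "custom"}}
	fmt.Println(autoSizingConfigNames(pools))
}
```

With a custom pool in the input, only the master and worker names are returned, which mirrors the behavior the updated tests are described as checking.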

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: restricting autosizing-disabled MachineConfig generation to master and worker pools only, addressing OCPBUGS-83862 for OCP 4.20. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names in the modified test file are stable, descriptive static strings with no dynamic information like generated IDs, timestamps, or other runtime-dependent values. |
| Test Structure And Quality | ✅ Passed | Tests show strong quality: clear single responsibility per test, meaningful assertion messages, proper use of package fixture pattern, no indefinite waits, and consistent with codebase patterns. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added. The PR adds standard Go unit tests using testing.T framework, not Ginkgo-based e2e tests, so the MicroShift compatibility check does not apply. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR contains only unit tests in pkg/controller/kubelet-config/, not Ginkgo e2e tests. SNO check does not apply to unit test modifications. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies MachineConfig (node-level) generation only, not pod scheduling. No pod affinity, topology spread, nodeSelectors, or topology-dependent constraints introduced. |
| Ote Binary Stdout Contract | ✅ Passed | Library package files with no process-level code. klog calls write to stderr by default; no fmt.Print* calls to stdout. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR contains only standard Go unit tests (testing.T), not Ginkgo e2e tests. The custom check only applies to Ginkgo e2e tests with IPv4/connectivity issues. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


@openshift-ci openshift-ci Bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 27, 2026
@openshift-ci

openshift-ci Bot commented Apr 27, 2026

Contributor

Hi @aksjadha. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@aksjadha aksjadha changed the title from "Ocpbugs 83862" to "OCPBUGS-83862: Generate autosizing-disabled machineconfig only for master and worker nodes on OCP 4.20" Apr 28, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Apr 28, 2026
@openshift-ci-robot

Contributor

@aksjadha: This pull request references Jira Issue OCPBUGS-83862, which is invalid:

  • expected the bug to target the "4.20.z" version, but no target version was set
  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-83862 to depend on a bug targeting a version in 4.21.0, 4.21.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Apr 28, 2026
Comment thread pkg/controller/kubelet-config/kubelet_config_autosizing.go Outdated

@ngopalak-redhat ngopalak-redhat left a comment

Contributor

Can you update the description explaining why this change is made and what it fixes?

// RunAutoSizingBootstrap generates auto-sizing MachineConfig objects for master and worker mcpPools
func RunAutoSizingBootstrap(mcpPools []*mcfgv1.MachineConfigPool) ([]*mcfgv1.MachineConfig, error) {
-	configs := make([]*mcfgv1.MachineConfig, 0, len(mcpPools))
+	var configs []*mcfgv1.MachineConfig

Contributor

Why is this changed?

Contributor Author

The change from make([]*mcfgv1.MachineConfig, 0, len(mcpPools)) to var configs []*mcfgv1.MachineConfig was made because the pre-allocated capacity is no longer accurate.

  • Before: Every pool produced a config, so len(mcpPools) was the exact capacity needed.
  • After: This change skips non-master/non-worker pools, so the actual number of configs will be less than or equal to len(mcpPools). Pre-allocating len(mcpPools) slots would over-allocate for any custom pools passed in.

Pre-allocating capacity is an optimization that is not required here. A simple var configs []*mcfgv1.MachineConfig (a nil slice, grown via append) avoids implying that the slice will hold one config per pool.
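
The difference between the two declarations can be illustrated with a small standalone sketch. The `keepDefault` helper and the pool names passed to it are hypothetical and exist only for this example; the point is that a nil slice grows through `append` as needed, while `make` with `len(pools)` as capacity reserves a slot for every input even when some inputs are skipped.

```go
package main

import "fmt"

// keepDefault filters pool names down to master and worker.
// It starts from a nil slice, so nothing is allocated until
// the first matching name is appended.
func keepDefault(names []string) []string {
	var kept []string // nil slice: length 0, capacity 0
	for _, n := range names {
		if n == "master" || n == "worker" {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	pools := []string{"master", "worker", "custom", "infra"}

	// Pre-allocating reserves room for all four pools up front.
	prealloc := make([]string, 0, len(pools))
	fmt.Println(len(prealloc), cap(prealloc))

	// The nil-slice version only ever holds the two default pools.
	kept := keepDefault(pools)
	fmt.Println(len(kept), kept)
}
```

Either form produces the same contents after filtering; the nil-slice form just does not suggest a one-to-one relationship between input pools and output configs.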

Comment thread pkg/controller/kubelet-config/kubelet_config_autosizing_test.go
@aksjadha aksjadha marked this pull request as ready for review May 20, 2026 14:09
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 20, 2026
@openshift-ci openshift-ci Bot requested review from RishabhSaini and yuqi-zhang May 20, 2026 14:11
@ngopalak-redhat

Contributor

/ok-to-test

@openshift-ci openshift-ci Bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 21, 2026
@ngopalak-redhat

Contributor

/assign @sairameshv

@ngopalak-redhat

Contributor

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ngopalak-redhat

Contributor

/jira refresh

@openshift-ci-robot

Contributor

@ngopalak-redhat: This pull request references Jira Issue OCPBUGS-83862, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected Jira Issue OCPBUGS-83862 to depend on a bug targeting a version in 4.21.0, 4.21.z and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 21, 2026
@openshift-ci-robot

Contributor

@aksjadha: This pull request references Jira Issue OCPBUGS-83862, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.z) matches configured target version for branch (4.20.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-86309 is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-86309 targets the "4.21.0" version, which is one of the valid target versions: 4.21.0, 4.21.z
  • bug has dependents

@openshift-ci

openshift-ci Bot commented May 21, 2026

Contributor

@aksjadha: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-op-ocl | 1473263 | link | false | /test e2e-gcp-op-ocl |

Full PR test history. Your PR dashboard.

@sairameshv sairameshv left a comment

Member

/lgtm
Changes look good to me

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 21, 2026
@openshift-merge-bot

Contributor

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

@isabella-janssen isabella-janssen left a comment

Member

/lgtm

@openshift-ci

openshift-ci Bot commented May 26, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aksjadha, isabella-janssen, sairameshv

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 26, 2026
@aksjadha

aksjadha commented Jun 1, 2026

Contributor Author

Test result:

On the 4.20 cluster, only the two MCs below were generated; no MC was generated for the custom pool.

50-master-auto-sizing-disabled                                                                 3.5.0             57m
50-worker-auto-sizing-disabled                                                                 3.5.0             57m

A KubeletConfig was created to enable autosizing on worker and custom pool nodes.

  kind: KubeletConfig
  metadata:
    finalizers:
    - 99-worker-generated-kubelet
  spec:
    autoSizingReserved: true
    machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/worker: ""
custom   rendered-custom-8b0a33572e129c5b2c044616367e6099   True      False      False      1              1                   1                     0                      14m
master   rendered-master-358509548ce0be1085e9b326aa66670e   True      False      False      3              3                   3                     0                      75m
worker   rendered-worker-8b0a33572e129c5b2c044616367e6099   True      False      False      2              2                   2                     0                      75m

With the 50-mcp-autosizing-disabled MC, the changes were not reverted on custom pool nodes.

ci-ln-blq6lw2-72292-92knw-worker-a-6m5cb   Ready    worker                 65m   v1.33.12

Starting pod/ci-ln-blq6lw2-72292-92knw-worker-a-6m5cb-debug ...
To use host binaries, run `chroot /host`
NODE_SIZING_ENABLED=true
SYSTEM_RESERVED_MEMORY=1Gi
SYSTEM_RESERVED_CPU=500m
SYSTEM_RESERVED_ES=1Gi

ci-ln-blq6lw2-72292-92knw-worker-c-ftkgj   Ready    custom,worker          65m   v1.33.12

ci-ln-blq6lw2-72292-92knw-worker-c-ftkgj
Starting pod/ci-ln-blq6lw2-72292-92knw-worker-c-ftkgj-debug ...
To use host binaries, run `chroot /host`
NODE_SIZING_ENABLED=true
SYSTEM_RESERVED_MEMORY=1Gi
SYSTEM_RESERVED_CPU=500m
SYSTEM_RESERVED_ES=1Gi
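
For anyone who wants to script the check on the env file output shown above, a minimal sketch might look like the following. The `parseEnv` and `sizingEnabled` helpers are hypothetical and not part of the machine-config-operator; the sketch assumes the file is a plain list of KEY=VALUE lines, as in the debug output, and only inspects the NODE_SIZING_ENABLED key.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv turns KEY=VALUE lines into a map, ignoring blank
// lines and lines that start with '#'.
func parseEnv(content string) map[string]string {
	env := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if key, value, ok := strings.Cut(line, "="); ok {
			env[key] = value
		}
	}
	return env
}

// sizingEnabled reports whether NODE_SIZING_ENABLED is set to "true".
func sizingEnabled(content string) bool {
	return parseEnv(content)["NODE_SIZING_ENABLED"] == "true"
}

func main() {
	// Content copied from the debug output on the custom pool node.
	content := "NODE_SIZING_ENABLED=true\n" +
		"SYSTEM_RESERVED_MEMORY=1Gi\n" +
		"SYSTEM_RESERVED_CPU=500m\n" +
		"SYSTEM_RESERVED_ES=1Gi\n"
	fmt.Println(sizingEnabled(content))
}
```

Feeding it the file contents from both the worker node and the custom pool node gives a quick yes/no answer on whether autosizing stayed enabled after the upgrade.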

@aksjadha

aksjadha commented Jun 2, 2026

Contributor Author

/verified

@openshift-ci-robot

Contributor

@aksjadha: The /verified command must be used with one of the following actions: by, later, remove, or bypass. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

Details

In response to this:

/verified /backport-risk-assessed

@aksjadha

aksjadha commented Jun 2, 2026

Contributor Author

/verified by @aksjadha

Verified the changes through manual testing and have added the test results above.

@openshift-ci-robot

Contributor

@aksjadha: Jira verification commands are restricted to collaborators for this repo.

@sairameshv sairameshv left a comment

Member

/verified by @aksjadha

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 2, 2026
@openshift-ci-robot

Contributor

@sairameshv: This PR has been marked as verified by @aksjadha.

@isabella-janssen

Member

/label backport-risk-assessed

@openshift-ci openshift-ci Bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Jun 2, 2026
@isabella-janssen

Member

/test e2e-aws-ovn-upgrade
/test e2e-gcp-op-part1
/test e2e-gcp-op-part2
/test e2e-gcp-op-single-node
/test e2e-hypershift

@openshift-merge-bot openshift-merge-bot Bot merged commit a271df1 into openshift:release-4.20 Jun 2, 2026
15 of 16 checks passed
@openshift-ci-robot

Contributor

@aksjadha: Jira Issue Verification Checks: Jira Issue OCPBUGS-83862
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-83862 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

@openshift-merge-robot

Contributor

Fix included in release 4.20.0-0.nightly-2026-06-02-232813

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • verified: Signifies that the PR passed pre-merge verification criteria
