Skip to content

Preserve Helm release values on rad upgrade kubernetes (#11218)#12029

Open
nicolejms wants to merge 9 commits into
mainfrom
nicolejms/fix-11218-rad-upgrade-reuse-values
Open

Preserve Helm release values on rad upgrade kubernetes (#11218)#12029
nicolejms wants to merge 9 commits into
mainfrom
nicolejms/fix-11218-rad-upgrade-reuse-values

Conversation

@nicolejms

@nicolejms nicolejms commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Description

rad upgrade kubernetes was passing chart defaults + current --set args to helm upgrade with no reuse flags, so any previously-stored user values (e.g. global.azureWorkloadIdentity.enabled=true, database.enabled=true) were silently reset to chart defaults on every upgrade.

Root cause

  1. pkg/cli/helm/radius.go::addArgsFromCLI parsed --set / --set-file directly into helmChart.Values, mutating the chart-default tree.
  2. pkg/cli/helm/helmclient.go::RunHelmUpgrade called upgradeClient.Run(name, chart, helmChart.Values) without setting any ReuseValues / ResetThenReuseValues. With no reuse flag, Helm treats the passed map as the complete user-supplied override set and discards anything stored on the previous release.

Fix

  • Stop mutating helmChart.Values from --set / --set-file. Parse user overrides into a separate map[string]any and pass them to Helm as vals.
  • Default upgradeClient.ResetThenReuseValues = true so Helm starts from the new chart defaults, overlays the user-supplied values stored on the previous release, and then overlays the vals from this upgrade — exactly the semantics requested in the issue. See the Helm upgrade docs for details on --reset-then-reuse-values behavior.
  • Add a --reset-values flag on rad upgrade kubernetes as the explicit opt-out (mirrors the helm CLI escape hatch).
  • Apply the same defaults / overrides separation to Contour so HostNetwork no longer pollutes stored release values.

Behavioral notes

  • First upgrade after this fix: existing releases were stored with mutated helmChart.Values as "user-supplied values". On the first upgrade after the fix, those previously-polluted values are replayed once. From the next upgrade onward the stored user values are clean. This matches helm upgrade --reuse-values semantics and is the right trade-off (preserves user intent, no surprise resets).
  • Users who want today's reset-on-upgrade behavior can pass --reset-values.

Tests

  • Added Test_prepareRadiusChart_DoesNotMutateChartValues as a regression guard — calls prepareRadiusChart with a mocked HelmClient returning a chart with non-empty Values, asserts (1) the returned vals contains the CLI overrides and (2) helmChart.Values remains unchanged.
  • Added Test_Helm_UpgradeRadius_ResetValues to verify that --reset-values flips reuseValues to false on the helm client call.
  • Added comprehensive unit tests to achieve 100% coverage on all changed functions:
    • Test_parseUserValuesFromCLI_InvalidSetArg, Test_parseUserValuesFromCLI_InvalidSetFileArg, Test_parseUserValuesFromCLI_EmptyArgs — error and edge-case paths for CLI value parsing.
    • Test_prepareRadiusChart_LoadChartError, Test_prepareRadiusChart_ParseValuesError — error paths in chart preparation.
    • Test_prepareContourChart_LoadChartError, Test_prepareContourChart_DoesNotMutateChartValues — Contour chart error path and immutability assertion.
    • Test_ApplyHelmChart_InstallError, Test_ApplyHelmChart_ReinstallPath, Test_ApplyHelmChart_ReinstallError, Test_ApplyHelmChart_AlreadyInstalled_NoReinstall, Test_ApplyHelmChart_QueryReleaseError — full coverage of install/reinstall/no-op/error paths.
    • Test_PopulateDefaultClusterOptions_ContourOverrides — Contour option override branches.
    • Test_Helm_UpgradeRadius_RadiusUpgradeError, Test_Helm_UpgradeRadius_ContourUpgradeError, Test_Helm_UpgradeRadius_CheckInstallError — upgrade error paths.
  • Updated existing helm package tests and mocks to thread the new vals and reuseValues parameters.
  • go test ./pkg/cli/helm/... and go test ./pkg/cli/cmd/upgrade/kubernetes/... pass.

Type of change

  • This pull request fixes a bug in Radius and has an approved issue (issue link required).

Contributor checklist

Please verify that the PR meets the following requirements, where applicable:

  • An overview of proposed schema changes is included in a linked GitHub issue.
    • Yes
    • Not applicable
  • A design document is added or updated under eng/design-notes/ in this repository, if new APIs are being introduced.
    • Yes
    • Not applicable
  • The design document has been reviewed and approved by Radius maintainers/approvers.
    • Yes
    • Not applicable
  • A PR for resource-types-contrib is created, if resource types or recipes are affected by the changes in this PR.
    • Yes
    • Not applicable
  • A PR for dashboard is created, if the Radius Dashboard is affected by the changes in this PR.
    • Yes
    • Not applicable
  • A PR for the documentation repository is created, if the changes in this PR affect the documentation or any user facing updates are made.
    • Yes
    • Not applicable

Copilot AI review requested due to automatic review settings June 1, 2026 18:38
@nicolejms nicolejms requested review from a team as code owners June 1, 2026 18:38
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes rad upgrade kubernetes overwriting previously stored Helm release values by separating chart defaults from CLI-provided overrides and enabling Helm “reset-then-reuse-values” behavior by default, with an explicit --reset-values opt-out.

Changes:

  • Parse --set/--set-file into a fresh vals map (no longer mutating helmChart.Values) and pass vals into Helm install/upgrade calls.
  • Default upgrades to ResetThenReuseValues semantics (preserve previously stored user overrides while picking up new chart defaults), with --reset-values to discard stored values.
  • Apply the same “defaults vs overrides” separation for the Contour chart; update mocks and tests for the new Helm client signatures/behavior.

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pkg/cli/helm/radius.go Returns a separate user-values map parsed from CLI instead of mutating chart defaults.
pkg/cli/helm/helmclient.go Extends HelmClient install/upgrade APIs to accept vals and a reuseValues toggle; sets ResetThenReuseValues.
pkg/cli/helm/helmaction.go Threads vals into install/upgrade and defaults reinstall upgrades to reuse prior stored values.
pkg/cli/helm/cluster.go Adds ResetValues option, plumbs it through upgrade logic, and passes per-invocation vals maps.
pkg/cli/cmd/upgrade/kubernetes/kubernetes.go Adds --reset-values flag and documents the new default upgrade value-preservation behavior.
pkg/cli/helm/contour.go Builds a fresh Contour override map (no chart mutation) for HostNetwork-related settings.
pkg/cli/helm/contour_test.go Updates tests to validate the new buildContourValues behavior.
pkg/cli/helm/helmaction_test.go Updates tests for the new CLI-values parsing helper and adds a regression test (see review comment).
pkg/cli/helm/cluster_test.go Updates mocks/expectations and adds coverage for the --reset-values behavior.
pkg/cli/helm/mock_helmclient.go Regenerated mock to match updated HelmClient method signatures.
pkg/cli/helm/mock_cluster.go Regenerated mock (parameter naming/marker field updates).
Files not reviewed (2)
  • pkg/cli/helm/mock_cluster.go: Language not supported
  • pkg/cli/helm/mock_helmclient.go: Language not supported

Comment thread pkg/cli/helm/helmaction_test.go Outdated
`rad upgrade kubernetes` was passing chart defaults + current --set args to `helm upgrade` with no reuse flags, so any previously-stored user values (e.g. `global.azureWorkloadIdentity.enabled=true`) were silently reset to chart defaults on every upgrade.

Fix:

- Stop mutating `helmChart.Values` from --set/--set-file. Parse user overrides into a separate map and pass them to Helm as `vals`.

- Set `upgradeClient.ResetThenReuseValues = true` by default so Helm starts from new chart defaults, overlays previously-stored user values, and then overlays the current --set overrides.

- Add a `--reset-values` flag on `rad upgrade kubernetes` as an explicit opt-out (matches the `helm` CLI escape hatch).

- Apply the same separation to Contour so HostNetwork no longer pollutes stored release values.

Fixes #11218

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: aeroNJneer <nicolej@microsoft.com>
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown

Unit Tests

    2 files  ± 0    438 suites  ±0   7m 30s ⏱️ +14s
5 340 tests +18  5 338 ✅ +18  2 💤 ±0  0 ❌ ±0 
6 494 runs  +18  6 492 ✅ +18  2 💤 ±0  0 ❌ ±0 

Results for commit 267f0ec. ± Comparison against base commit 1dbd9a1.

This pull request removes 3 and adds 21 tests. Note that renamed tests count towards both.
github.com/radius-project/radius/pkg/cli/helm ‑ TestAddContourValues_HostNetworkDisabled_NoChange
github.com/radius-project/radius/pkg/cli/helm ‑ TestAddContourValues_HostNetworkEnabled
github.com/radius-project/radius/pkg/cli/helm ‑ Test_addArgsFromCLI
github.com/radius-project/radius/pkg/cli/helm ‑ TestBuildContourValues_HostNetworkDisabled_ReturnsEmpty
github.com/radius-project/radius/pkg/cli/helm ‑ TestBuildContourValues_HostNetworkEnabled
github.com/radius-project/radius/pkg/cli/helm ‑ Test_ApplyHelmChart_AlreadyInstalled_NoReinstall
github.com/radius-project/radius/pkg/cli/helm ‑ Test_ApplyHelmChart_InstallError
github.com/radius-project/radius/pkg/cli/helm ‑ Test_ApplyHelmChart_QueryReleaseError
github.com/radius-project/radius/pkg/cli/helm ‑ Test_ApplyHelmChart_ReinstallError
github.com/radius-project/radius/pkg/cli/helm ‑ Test_ApplyHelmChart_ReinstallPath
github.com/radius-project/radius/pkg/cli/helm ‑ Test_Helm_UpgradeRadius_CheckInstallError
github.com/radius-project/radius/pkg/cli/helm ‑ Test_Helm_UpgradeRadius_ContourUpgradeError
github.com/radius-project/radius/pkg/cli/helm ‑ Test_Helm_UpgradeRadius_RadiusUpgradeError
…

♻️ This comment has been updated with latest results.

@codecov

codecov Bot commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 82.85714% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.25%. Comparing base (1dbd9a1) to head (267f0ec).

Files with missing lines Patch % Lines
pkg/cli/helm/helmclient.go 0.00% 9 Missing ⚠️
pkg/cli/helm/contour.go 91.30% 2 Missing ⚠️
pkg/cli/helm/radius.go 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12029      +/-   ##
==========================================
+ Coverage   52.13%   52.25%   +0.12%     
==========================================
  Files         734      734              
  Lines       46704    46717      +13     
==========================================
+ Hits        24350    24414      +64     
+ Misses      20017    19992      -25     
+ Partials     2337     2311      -26     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@nicolejms

Copy link
Copy Markdown
Contributor Author

@copilot add dev unit tests to bring code coverage too 100% for the changes

Copilot AI added 2 commits June 3, 2026 17:48
- parseUserValuesFromCLI: error paths for invalid --set and --set-file args
- prepareRadiusChart: error paths for chart load and value parse failures
- prepareContourChart: error path and immutability assertion
- ApplyHelmChart: install error, reinstall path, reinstall error, no-op, query error
- PopulateDefaultClusterOptions: Contour overrides branch
- UpgradeRadius: Radius upgrade error, Contour upgrade error, CheckInstall error

Copilot AI commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

@copilot add dev unit tests to bring code coverage too 100% for the changes

Added comprehensive unit tests in commit fix: correct indentation in ApplyHelmChart test functions. Coverage results for PR-changed functions:

  • PopulateDefaultClusterOptions: 100%
  • ApplyHelmChart: 100%
  • buildContourValues: 100%
  • prepareRadiusChart: 92.9% (remaining: initHelmConfig error — internal infra dep)
  • parseUserValuesFromCLI: 92.3% (remaining: runtime.GOOS == "windows" branch — untestable on Linux)
  • UpgradeRadius: 92% (remaining: prepareContourChart internal error path)
  • prepareContourChart: 85.7% (remaining: initHelmConfig error)

The only uncovered lines are either OS-gated (runtime.GOOS) or deep infrastructure (initHelmConfig creating Kubernetes config objects) that cannot be exercised without a real cluster or extensive internal mocking.

@radius-functional-tests

radius-functional-tests Bot commented Jun 8, 2026

Copy link
Copy Markdown

Radius functional test overview

🔍 Go to test action run

Click here to see the test run details
Name Value
Repository radius-project/radius
Commit ref 267f0ec
Unique ID func79fc2b0f0d
Image tag pr-func79fc2b0f0d
  • gotestsum 1.13.0
  • KinD: v0.29.0
  • Dapr: 1.14.4
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.3.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func79fc2b0f0d
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func79fc2b0f0d
  • dynamic-rp test image location: ghcr.io/radius-project/dev/dynamic-rp:pr-func79fc2b0f0d
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-func79fc2b0f0d
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func79fc2b0f0d
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting ucp-cloud functional tests...
⌛ Starting corerp-cloud functional tests...
✅ ucp-cloud functional tests succeeded
✅ corerp-cloud functional tests succeeded

@nithyatsu nithyatsu requested a review from willdavsmith June 8, 2026 18:12

@willdavsmith willdavsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking pretty good. left a couple of comments/questions

Comment thread pkg/cli/helm/cluster.go
// By default it preserves the user-supplied values stored on the previous Helm release (ResetThenReuseValues
// semantics): the upgrade starts from the new chart defaults, re-applies the previously stored user overrides,
// and then overlays the current invocation's --set / --set-file values. Set ClusterOptions.ResetValues to true
// to opt out and use only the current invocation's overrides.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a Helm docs link that we could add here on the ResetThenReuseValues semantics?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 8a6a9c2. Added the link https://helm.sh/docs/helm/helm_upgrade/#options to the UpgradeRadius function doc comment, the RunHelmUpgrade interface doc, and the implementation doc.

// Mock Helm Install
mockHelmClient.EXPECT().RunHelmInstall(gomock.AssignableToTypeOf(&helm.Configuration{}), gomock.AssignableToTypeOf(&chart.Chart{}), "radius", "radius-system", true).Return(radiusRelease, nil).Times(1)
mockHelmClient.EXPECT().RunHelmInstall(gomock.AssignableToTypeOf(&helm.Configuration{}), gomock.AssignableToTypeOf(&chart.Chart{}), "contour", "radius-system", false).Return(contourRelease, nil).Times(1)
mockHelmClient.EXPECT().RunHelmInstall(gomock.AssignableToTypeOf(&helm.Configuration{}), gomock.AssignableToTypeOf(&chart.Chart{}), gomock.Any(), "radius", "radius-system", true).Return(radiusRelease, nil).Times(1)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why were these changed to gomock.Any()?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vals parameter (map[string]any) is now dynamically built by prepareRadiusChart/prepareContourChart from the chart options at runtime, so the exact map contents depend on the test's CLIClusterOptions. Using gomock.Any() avoids brittle tests that break if we add or rename a default value. The important assertions about vals content are covered in the dedicated Test_prepareRadiusChart_DoesNotMutateChartValues and Test_prepareContourChart_DoesNotMutateChartValues tests which verify the exact map contents returned by the prepare functions.

Comment thread pkg/cli/helm/contour.go
"dnsPolicy": "ClusterFirstWithHostNet",
// Container ports for the Envoy pod must be 80/443 to allow Envoy to bind
// directly on the host network.
"containerPorts": map[string]any{

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot are these changes in contour.go going to cause regressions? or are they correct w.r.t. the previous implementation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are correct and won't cause regressions. The old addContourValues mutated helmChart.Values in place (navigating into nested maps like helmChart.Values["envoy"].(map[string]any) and writing keys directly). The new buildContourValues produces the exact same key/value structure but as a standalone map that gets passed to Helm as user overrides via the vals parameter.

The end result at render time is identical — Helm merges vals on top of chart defaults — but now the chart's default Values map stays clean. This means ResetThenReuseValues works correctly on subsequent upgrades (only the explicit overrides get stored as "user values" on the release, not the entire mutated defaults tree).

The old code also had defensive nil checks that would return errors if chart values didn't have the expected nested structure (e.g. envoy.containerPorts node not found). Those can't fail in the new code because we're building the map from scratch rather than navigating an existing tree.

// RunHelmInstall installs the Helm chart using the supplied user-values map as the
// override set. The map should contain only user-supplied overrides (not the chart
// defaults); Helm merges them on top of the chart's defaults at render time.
RunHelmInstall(helmConf *helm.Configuration, helmChart *chart.Chart, vals map[string]any, releaseName, namespace string, wait bool) (*release.Release, error)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a better data model for vals besides map[string]any? is there a type/interface to use from the helm package? I assume that we would only accept specific values here (helm cli values)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helm's own SDK uses map[string]interface{} (i.e., map[string]any) throughout — see action.Install.Run() and action.Upgrade.Run() which both accept vals map[string]interface{}. There's no typed struct or interface in the Helm package for values; they're always an untyped nested map matching the YAML structure. Using map[string]any here is consistent with what the Helm SDK expects and what strvals.Parse() returns.

Copilot AI requested a review from willdavsmith June 8, 2026 22:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants