fix(vault-seed): cache generated secrets in K8s Secrets and push from the cache #1981
Merged
… the cache
Generated secrets (dex/flux-web client secrets, oauth2-proxy cookie,
fleetdm bootstrap passwords, umami app/admin/tenant passwords) were
pushed straight from ESO Password generators. A generatorRef-selector
PushSecret yields a NEW random value on every sync, so they had to be
push-once (refreshInterval "0") — which meant the generated values
existed nowhere but OpenBao itself. When the 2026-06-10 incident
re-initialized the vault with an empty KV store, they were simply gone:
no self-healing path, every consumer ExternalSecret stuck in
SecretSyncedError.
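For illustration, the push-once shape described above looks roughly like the manifests below. This is a minimal sketch, not the repository's actual files: the resource names, the store name, the KV path, the generator settings, and the namespace of the PushSecret are all assumptions made for the example.

```yaml
# Hypothetical Password generator (name and settings are illustrative).
apiVersion: generators.external-secrets.io/v1alpha1
kind: Password
metadata:
  name: example-client-secret
  namespace: openbao
spec:
  length: 32
  digits: 5
  symbols: 0
  noUpper: false
  allowRepeat: true
---
# Old shape: the PushSecret selects the generator directly.
apiVersion: external-secrets.io/v1alpha1
kind: PushSecret
metadata:
  name: push-example-client-secret      # hypothetical
  namespace: openbao
spec:
  refreshInterval: "0"                  # push once; another sync would push a different random value
  secretStoreRefs:
    - name: openbao                     # hypothetical store name
      kind: ClusterSecretStore
  selector:
    generatorRef:
      apiVersion: generators.external-secrets.io/v1alpha1
      kind: Password
      name: example-client-secret
  data:
    - match:
        secretKey: password             # key emitted by the Password generator
        remoteRef:
          remoteKey: example/client-secret   # hypothetical KV path
          property: password
```

With this shape the generated value is never written to any Kubernetes Secret, so the KV entry in OpenBao is its only copy.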
Split each into two steps:
  1. a generated-* ExternalSecret runs the Password generator exactly
     once (refreshInterval "0") and persists the value in a durable
     cache Secret in the openbao namespace
  2. the existing PushSecret mirrors that cache Secret into OpenBao
     hourly (refreshInterval 1h), the same pattern as
     push-umami-db-superuser and the SOPS-sourced seeds
Healthy vault: the hourly push is a no-op (values never rotate).
Wiped vault: the KV store re-seeds itself within the hour.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ℹ️ The 🧪 System Test failure here is pre-existing on main, not caused by this PR: every CI run since ~18:30 UTC today fails identically (including the unrelated Renovate PRs #1971/#1972/#1974). The test cluster's |
🎉 This PR is included in version 1.47.0 🎉
The release is available on GitHub release.
Your semantic-release bot 📦🚀
Root cause
The generated secrets (dex / flux-web client secrets, oauth2-proxy cookie secret, fleetdm bootstrap passwords, umami app/admin/tenant passwords) were pushed straight from ESO Password generators. A generatorRef-selector PushSecret produces a new random value on every sync, so they had to be push-once (refreshInterval: "0"), which made OpenBao the only place the values existed. When the 2026-06-10 incident re-initialized the vault with an empty KV store (context in #1979/#1980), they were unrecoverable, and every consumer ExternalSecret (vault-config-oidc, headlamp-oidc, actual-budget-oidc, umami-admin, …) wedged in SecretSyncedError with no self-healing path.
Fix
Each generated secret becomes two steps (the proven push-umami-db-superuser pattern):
1. A generated-* ExternalSecret runs the Password generator exactly once (refreshInterval: "0") and persists the value in a durable cache Secret in the openbao namespace.
2. The existing PushSecret mirrors that cache Secret into OpenBao hourly (refreshInterval: 1h, selector switched from generatorRef to the cache Secret).
Healthy vault → hourly push is a no-op; values never rotate. Wiped vault → KV re-seeds within the hour from the cache.
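The two steps can be sketched as a pair of manifests. This is a minimal sketch under stated assumptions, not the PR's actual diff: the resource names, the store name, and the KV path are hypothetical, a Password generator named example-client-secret is assumed to already exist in the openbao namespace, and the ExternalSecret apiVersion depends on the installed ESO release (external-secrets.io/v1 on newer ones).

```yaml
# Step 1: run the generator once and persist the value in a cache Secret.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: generated-example-client-secret     # hypothetical
  namespace: openbao
spec:
  refreshInterval: "0"                      # fetch once; the cached value never rotates
  target:
    name: generated-example-client-secret   # the durable cache Secret
  dataFrom:
    - sourceRef:
        generatorRef:
          apiVersion: generators.external-secrets.io/v1alpha1
          kind: Password
          name: example-client-secret       # hypothetical, assumed to exist
---
# Step 2: mirror the cache Secret into OpenBao on a schedule.
apiVersion: external-secrets.io/v1alpha1
kind: PushSecret
metadata:
  name: push-example-client-secret          # hypothetical
  namespace: openbao
spec:
  refreshInterval: 1h                       # re-push hourly; re-seeds a wiped KV store
  secretStoreRefs:
    - name: openbao                         # hypothetical store name
      kind: ClusterSecretStore
  selector:
    secret:
      name: generated-example-client-secret # cache Secret instead of generatorRef
  data:
    - match:
        secretKey: password                 # key written by the Password generator
        remoteRef:
          remoteKey: example/client-secret  # hypothetical KV path
          property: password
```

Because the PushSecret now reads a stable Secret rather than a generator, repeating the push sends the same value each time, which is what makes the hourly interval safe.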
Because the old generated values died with the wiped vault, the caches will generate fresh values on first reconcile. All consumers converge automatically via their ExternalSecrets (SSO clients, cookie secret → existing sessions reset; fleetdm has no pods yet, harmless), except:
- umami password. After this merges, the admin password must be updated once in the umami UI: log in with the old value from the current umami/umami-admin Secret, then set it to the new value from openbao/generated-umami-admin-password. The provision-tenants CronJob then goes green again.
Validation
- kubectl kustomize (local + prod) ✅
- ksail workload validate → 314 files validated ✅
- kubectl apply --dry-run=client on the changed file against live CRDs ✅
🤖 Generated with Claude Code