design-proposal: hybrid kubernetes clusters (Phase 3 placeholder) #9
Placeholder for the future Phase 3 of the kubernetes-application reshape: workers in external environments (cloud autoscaling, BYO clusters, bare metal). Deferred until Phase 1 + Phase 2 in PR cozystack#8 land.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Code Review
This pull request introduces a design proposal for Phase 3 of the hybrid Kubernetes cluster implementation, focusing on supporting worker nodes in external environments such as public clouds and on-premises datacenters. The feedback provided focuses on improving the technical documentation's clarity and grammatical precision, including corrections for terminology like 'on-premises', ensuring proper pronoun agreement, removing redundant phrasing, and standardizing US English spelling.
- **External cloud workers**: a tenant cluster running its control-plane in Cozystack (Kamaji) but its worker nodes as cloud VMs in Hetzner, Azure, AWS, GCP, etc. Driven by `cluster-autoscaler` with the cloud's native provider, not by CAPI.
- **BYO clusters**: tenants who bring their own cloud account and want their pool to be billed against that account rather than the Cozystack platform's. Implies admin-managed *or* tenant-managed location ownership.
- **Bare metal / on-premise workers**: a tenant wanting nodes in their own datacenter joined to a Cozystack-hosted control-plane.
In technical documentation, "on-premises" is the correct adjective for hardware or software located on an organization's own site; "premise" in the singular means a proposition or assumption. Also, "tenant" is singular, so "its own" is more appropriate than "their own".
Suggested change:
- - **Bare metal / on-premise workers**: a tenant wanting nodes in their own datacenter joined to a Cozystack-hosted control-plane.
+ - **Bare metal / on-premises workers**: a tenant wanting nodes in its own datacenter joined to a Cozystack-hosted control-plane.
The Novolos use case is the concrete driving example: workers in different tenant clouds, each with their own `cluster-autoscaler`, all joining a single managed Kamaji control-plane.
Grammatical correction: "each" is singular and should be followed by "its own" rather than "their own".
Suggested change:
- The Novolos use case is the concrete driving example: workers in different tenant clouds, each with their own `cluster-autoscaler`, all joining a single managed Kamaji control-plane.
+ The Novolos use case is the concrete driving example: workers in different tenant clouds, each with its own `cluster-autoscaler`, all joining a single managed Kamaji control-plane.
1. The package split delivered by Phase 2 (PR #8) is the architectural seam Phase 3 needs. Designing external backends before the split is in place forces shoehorning them into the monolithic `kubernetes` chart's `nodeGroups`, which doesn't fit semantically and burns design effort that Phase 2 reclaims.
2. The Talos worker base delivered by Phase 1 (PR #8) is what makes external workers tractable in the first place. Ubuntu + kubeadm joining a remote Kamaji cluster is operationally awkward; Talos + machineconfig over cloud-init is the path of least resistance for both KubeVirt VMs (in-cluster) and cloud VMs (external).
3. Several open Cozystack-side decisions (admin- vs tenant-owned location ownership, credential model for BYO clouds, default deny vs explicit advertise, dashboard surfacing) are best made with concrete Phase 1 + 2 operational experience in hand, not in advance.
The phrase "tenant-owned location ownership" is redundant. Using "tenant-level location ownership" or "ownership of tenant-owned locations" is more concise.
Suggested change:
- 3. Several open Cozystack-side decisions (admin- vs tenant-owned location ownership, credential model for BYO clouds, default deny vs explicit advertise, dashboard surfacing) are best made with concrete Phase 1 + 2 operational experience in hand, not in advance.
+ 3. Several open Cozystack-side decisions (admin- vs tenant-level location ownership, credential model for BYO clouds, default deny vs explicit advertise, dashboard surfacing) are best made with concrete Phase 1 + 2 operational experience in hand, not in advance.
Several patterns were raised during early discussion of PR #8. They are listed here so the conversation does not restart from zero when work resumes, but **none of them is committed**.
- **New `backend.type` field in `kubernetes-nodes`.** The single-backend "kubevirt-talos" shape from Phase 2 grows a discriminator: `kubevirt-talos`, `cloud-talos-hetzner`, `cloud-talos-azure`, etc. Per-backend sub-charts realise the actual lifecycle (CAPI for KubeVirt-VM backends; `cluster-autoscaler` directly against the cloud's native API for cloud backends).
The project appears to follow US English spelling conventions (e.g., "reshape", "autoscaler"). "Realize" should be used instead of "realise".
Suggested change:
- - **New `backend.type` field in `kubernetes-nodes`.** The single-backend "kubevirt-talos" shape from Phase 2 grows a discriminator: `kubevirt-talos`, `cloud-talos-hetzner`, `cloud-talos-azure`, etc. Per-backend sub-charts realise the actual lifecycle (CAPI for KubeVirt-VM backends; `cluster-autoscaler` directly against the cloud's native API for cloud backends).
+ - **New `backend.type` field in `kubernetes-nodes`.** The single-backend "kubevirt-talos" shape from Phase 2 grows a discriminator: `kubevirt-talos`, `cloud-talos-hetzner`, `cloud-talos-azure`, etc. Per-backend sub-charts realize the actual lifecycle (CAPI for KubeVirt-VM backends; `cluster-autoscaler` directly against the cloud's native API for cloud backends).
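For readers following the `backend.type` sketch quoted in this thread, the discriminator idea could be illustrated roughly as below. This is only a sketch under stated assumptions, not the proposal's design: the proposal explicitly commits to none of these patterns. The `Lifecycle` type, the `LifecycleFor` function, and the rule that any `cloud-talos-<provider>` value selects `cluster-autoscaler` are hypothetical names and behavior introduced purely for illustration; only the example `backend.type` values and the CAPI vs `cluster-autoscaler` split come from the quoted text.

```go
package main

import (
	"fmt"
	"strings"
)

// Lifecycle names the mechanism that would manage worker nodes for a backend.
// Hypothetical type, introduced only for this illustration.
type Lifecycle string

const (
	LifecycleCAPI              Lifecycle = "capi"
	LifecycleClusterAutoscaler Lifecycle = "cluster-autoscaler"
)

// cloudPrefix is an assumed naming rule: the proposal lists example values
// such as cloud-talos-hetzner but does not define a prefix convention.
const cloudPrefix = "cloud-talos-"

// LifecycleFor maps a backend.type value to the lifecycle driver described
// in the sketch: CAPI for the KubeVirt-VM backend, cluster-autoscaler for
// cloud backends. Unknown values are rejected rather than defaulted.
func LifecycleFor(backendType string) (Lifecycle, error) {
	switch {
	case backendType == "kubevirt-talos":
		return LifecycleCAPI, nil
	case strings.HasPrefix(backendType, cloudPrefix) && len(backendType) > len(cloudPrefix):
		return LifecycleClusterAutoscaler, nil
	default:
		return "", fmt.Errorf("unknown backend.type %q", backendType)
	}
}

func main() {
	for _, bt := range []string{"kubevirt-talos", "cloud-talos-hetzner", "cloud-talos-azure", "ubuntu-kubeadm"} {
		lc, err := LifecycleFor(bt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", bt, lc)
	}
}
```

Rejecting unknown values instead of falling back to a default keeps a typo in `backend.type` from silently selecting the wrong lifecycle; whether a real implementation would want that behavior is an open question for Phase 3.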
Summary
Placeholder for Phase 3 of the kubernetes-application reshape: workers in environments outside the Cozystack management cluster (cloud autoscaling against Hetzner/Azure/AWS/GCP, BYO clusters, bare-metal/on-prem workers).
Held in draft pending PR #8 (Phase 1 + Phase 2: Talos migration + package split). Phase 3 depends on the architectural seam delivered by PR #8 and on operational experience from Phase 1+2.
This proposal does not commit to any specific shape for Phase 3. It documents the intended scope, the reasons for deferral, and a set of non-committal sketches collected during early discussion of PR #8, so that the design conversation does not restart from zero when work resumes.
Test plan
This is a placeholder proposal: no implementation, no tests. Implementation testing will be scoped when this proposal is filled in.