ROX-35000: Add optional OpenShift Virtualization support to openshift-4 flavor by vikin91 · Pull Request #1855 · stackrox/infra

vikin91 · 2026-06-10T10:27:57Z

Add install-virt, vm-os, and virt-node-dedicated parameters to the openshift-4 (and openshift-4-perf-scale) flavor so users can optionally get a cluster with OpenShift Virtualization, a dedicated virt worker node, and a downloadable VM access artifact.

When install-virt=true, three conditional post-install steps run after cluster creation:

add-virt-node — clones a deterministically selected worker MachineSet into an n2-standard-8 virt-labeled node (optionally tainted when virt-node-dedicated=true)
install-virt-operator — installs OpenShift Virtualization via OLM with VSOCK and KVM_EMULATION enabled
create-vm — deploys a VM from quay.io/rhacs-eng/vm-images using the selected vm-os (rhel9 or rhel10), generates a random password, and writes a downloadable vm-access artifact with SSH credentials

When install-virt=false (the default), existing behavior is unchanged.
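
The description does not spell out how add-virt-node picks its source MachineSet. A minimal sketch of one deterministic choice, assuming the rule is "lexicographically first MachineSet in the openshift-machine-api namespace" (the function name and the sort rule are illustrative assumptions, not taken from the PR):

```shell
#!/bin/bash
# Hypothetical helper: prints the name of the MachineSet to clone.
# Assumption: sorting by name is what makes the selection deterministic.
select_source_machineset() {
  local namespace="${1:-openshift-machine-api}"
  local selected
  # `-o name` prints one "<resource>/<name>" entry per line; keep only <name>.
  selected=$(oc get machineset -n "$namespace" -o name 2>/dev/null \
    | sed 's|.*/||' \
    | LC_ALL=C sort \
    | head -n 1)
  if [ -z "$selected" ]; then
    echo "ERROR: no MachineSet found in namespace $namespace" >&2
    return 1
  fi
  echo "$selected"
}
```

Sorting under a fixed locale keeps the result stable across retries, so a re-run would clone the same MachineSet rather than creating a second virt node.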

Usage examples

# With virt (shared node)
infractl create openshift-4 my-virt-cluster \
  --arg install-virt=true \
  --arg vm-os=rhel9 \
  --lifespan 8h

# With virt (dedicated, tainted node)
infractl create openshift-4 my-virt-dedicated \
  --arg install-virt=true \
  --arg vm-os=rhel9 \
  --arg virt-node-dedicated=true \
  --lifespan 8h

# Without virt (default, unchanged behavior)
infractl create openshift-4 my-cluster --lifespan 4h

How I validated my change

Unit tests
Parameter parity validation (CheckWorkflowEquivalence): extracted workflow and flavor parameter names, diffed — both openshift-4 and openshift-4-perf-scale match
E2E: create a cluster with install-virt=true and verify the virt node, operator, and VM are running, and the vm-access artifact is downloadable
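
The manual parity check described above (extract names, then diff) could be sketched roughly as follows. This assumes the parameter names have already been extracted into two newline-separated text files; the function name and file layout are hypothetical and are not part of CheckWorkflowEquivalence:

```shell
#!/bin/bash
# Hypothetical helper: compares two newline-separated lists of parameter names.
# Assumption: extraction from flavors.yaml and the workflow happened beforehand.
check_param_parity() {
  local flavor_names="$1" workflow_names="$2" tmp rc=0
  tmp=$(mktemp -d)
  # Sort and de-duplicate so that ordering differences are not reported.
  LC_ALL=C sort -u "$flavor_names" > "$tmp/flavor"
  LC_ALL=C sort -u "$workflow_names" > "$tmp/workflow"
  # diff prints the mismatched names and exits non-zero when the sets differ.
  diff "$tmp/flavor" "$tmp/workflow" || rc=1
  rm -rf "$tmp"
  return "$rc"
}
```

The function returns 0 when both lists contain the same names and 1 otherwise, so it can gate a CI step directly.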

AI-Assisted: cursor, ~90% generated from Superpowers implementation plan, user reviewed the plan twice.

Co-authored-by: Cursor <cursoragent@cursor.com>


coderabbitai · 2026-06-10T10:28:07Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added support for installing OpenShift Virtualization on OpenShift clusters.
- Introduced configurable VM operating system selection.
- Added dedicated virtual machine node support with optional taint configuration.
- Enabled VM access credentials and connection details via new artifacts.

Walkthrough

This PR adds OpenShift Virtualization (KubeVirt) VM provisioning capability to the infra server. It exposes virtualization parameters in both openshift-4 flavors and implements a three-stage workflow that provisions virt nodes, installs the HyperConverged operator, and creates KubeVirt VirtualMachines with VSOCK and registry pull secret integration.

Changes

OpenShift Virtualization (KubeVirt) provisioning

Layer / File(s)	Summary
Flavor virtualization parameters and artifacts `chart/infra-server/static/flavors.yaml`	Both `openshift-4` and `openshift-4-perf-scale` flavors expose `install-virt`, `vm-os`, and `virt-node-dedicated` parameters, plus declare a new `vm-access` artifact for VM credentials and access commands.
Workflow parameters and conditional orchestration `chart/infra-server/static/workflow-openshift-4.yaml`	Workflow adds input parameters mirroring flavor options, a `registry-pull-secret` volume from a Kubernetes secret, and conditional steps in the `start` template that execute virt setup templates only when `install-virt` is true.
Virt node provisioning template `chart/infra-server/static/workflow-openshift-4.yaml`	The `add-virt-node` template deterministically creates or reuses a virt-dedicated worker MachineSet from an existing worker, optionally applies a NoSchedule taint, and waits for virt-role nodes to reach Ready with timeout and failure diagnostics.
Virtualization operator installation template `chart/infra-server/static/workflow-openshift-4.yaml`	The `install-virt-operator` template installs HyperConverged operator components (Namespace/OperatorGroup/Subscription), waits for operator CSV and HyperConverged health conditions, then patches subscription to enable `KVM_EMULATION`.
VM creation and access provisioning template `chart/infra-server/static/workflow-openshift-4.yaml`	The `create-vm` template validates `vm-os`, generates VM passwords, creates a registry pull secret in `openshift-cnv` namespace, applies a VSOCK-enabled VirtualMachine with optional node placement and tolerations, waits for VMI readiness, and writes VM access details to artifact.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and accurately summarizes the main change: adding optional OpenShift Virtualization support to the openshift-4 flavor with specific parameters.
Description check	✅ Passed	The description is well-related to the changeset, explaining the new parameters, the three conditional workflow steps, usage examples, and validation approach.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


rhacs-bot · 2026-06-10T10:36:58Z

A single node development cluster (infra-pr-1855) was allocated in production infra for this PR.

CI will attempt to deploy quay.io/rhacs-eng/infra-server: to it.

🔌 You can connect to this cluster with:

gcloud container clusters get-credentials infra-pr-1855 --zone us-central1-a --project acs-team-temp-dev

🛠️ And pull infractl from the deployed dev infra-server with:

nohup kubectl -n infra port-forward svc/infra-server-service 8443:8443 &
make pull-infractl-from-dev-server

🚲 You can then use the dev infra instance e.g.:

bin/infractl -k -e localhost:8443 whoami

⚠️ Any clusters that you start using your dev infra instance should have a lifespan shorter than that of the development cluster. Otherwise they will not be destroyed, because the dev infra instance ceases to exist when the development cluster is deleted. ⚠️

Further Development

☕ If you make changes, you can commit and push and CI will take care of updating the development cluster.

🚀 If you only modify configuration (chart/infra-server/configuration) or templates (chart/infra-server/{static,templates}), you can get a faster update with:

make helm-deploy

Logs

Logs for the development infra depending on your @redhat.com authuser:

Or:

kubectl -n infra logs -l app=infra-server --tail=1 -f

vikin91 · 2026-06-10T11:19:42Z

@coderabbitai full review

coderabbitai · 2026-06-10T11:19:50Z

✅ Action performed

Full review finished.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chart/infra-server/static/workflow-openshift-4.yaml`:
- Around line 591-611: The loop waiting for VMI (using VM_NAME, NAMESPACE,
PHASE, timeout, elapsed) currently only logs a warning on timeout and exits 0;
change it so that when elapsed >= timeout you first persist any needed debugging
artifacts (e.g., oc get vmi -o yaml for "$VM_NAME" and logs) and then exit
non-zero to fail the virt path (use a descriptive error log and exit 1). Locate
the while-true block that checks PHASE=$(oc get vmi "$VM_NAME" -n "$NAMESPACE"
-o jsonpath='{.status.phase}') and replace the timeout branch to collect/persist
debug output before calling exit 1 so the workflow fails instead of succeeding.
- Around line 493-519: The script currently generates VM_PASSWORD and rewrites
vm-access.md before checking whether the VM already exists, causing
stale/unapplied credentials to be published; change the flow in the VM creation
logic so that VM_PASSWORD is only generated and vm-access.md is only
created/overwritten after confirming the VM does not already exist (i.e., move
the VM_PASSWORD assignment and the vm-access.md write block to run inside the
branch where the VM will be created), and if you must support resume/retry, read
any existing credential artifact or the VM/secret first and skip
regenerating/writing when credentials already exist.
- Around line 444-455: After patching KVM_EMULATION in install-virt-operator,
add a reconciliation wait before the create-vm step: poll the HyperConverged
resource (e.g., name hyperconverged-cluster or the HCO resource used in this
flow) in namespace openshift-cnv and loop until its status.conditions show
Available=True, Progressing=False and Degraded=False (and/or confirm the
subscription/env update has taken effect), with a timeout and backoff; only
proceed to the create-vm step when those health checks pass to ensure
KubeVirt/HCO has reconciled the new KVM_EMULATION setting.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 656e2693-b4ae-4d7e-9497-ae0c3ec2dad1

📥 Commits

Reviewing files that changed from the base of the PR and between 99a66e9 and a5cc60f.

📒 Files selected for processing (2)

chart/infra-server/static/flavors.yaml
chart/infra-server/static/workflow-openshift-4.yaml

coderabbitai · 2026-06-10T11:26:49Z

+          # Patch KVM_EMULATION
+          current_kvm=$(oc get subscription kubevirt-hyperconverged -n openshift-cnv -o jsonpath='{.spec.config.env[?(@.name=="KVM_EMULATION")].value}' 2>/dev/null || echo "")
+          if [ "$current_kvm" = "true" ]; then
+            echo "KVM_EMULATION already set"
+          else
+            echo "Patching subscription with KVM_EMULATION..."
+            oc patch subscription kubevirt-hyperconverged \
+              -n openshift-cnv \
+              --type=merge \
+              -p '{"spec":{"config":{"selector":{"matchLabels":{"name":"hyperconverged-cluster-operator"}},"env":[{"name":"KVM_EMULATION","value":"true"}]}}}'
+            echo "KVM_EMULATION patched"
+          fi


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In OpenShift Virtualization / HyperConverged Operator, after setting KVM_EMULATION=true through the Subscription config, what condition should be waited on before creating a VM?

💡 Result:

When you update the Subscription for the HyperConverged Operator (HCO) to include KVM_EMULATION=true, the operator reconciles this change by updating the underlying KubeVirt configuration [1][2]. You should wait for the HyperConverged Cluster Operator (HCO) to complete its reconciliation and for the KubeVirt custom resource (CR) to reflect the updated configuration [2]. Specifically, you should verify that the HCO is in a ready state and that the KubeVirt operand has successfully processed the new configuration [2]. While there is no single "KVM_EMULATION_READY" status condition to watch for, you can confirm the state by ensuring the HyperConverged operator pod(s) have finished reconciling and checking the KubeVirt status or associated virt-handler logs to ensure the system is correctly identifying the emulation mode (or lack of KVM) [3][4][5]. In many environments, these changes are applied immediately by the operator [2].

Citations:

1: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/deploy/kustomize/kvm_emulation/subscription.patch.yaml

2: https://patriciocerda.com/?p=1629

3: https://access.redhat.com/solutions/7092883

4: kubevirt/kubevirt@0fb325a

5: https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=1992608

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
ls -l "$FILE"
sed -n '1,220p' "$FILE" | nl -ba | head -n 220
echo "----"
sed -n '220,420p' "$FILE" | nl -ba | head -n 200
echo "----"
sed -n '420,520p' "$FILE" | nl -ba
echo "----"
# Find where create-vm template defined and where it is invoked
rg -n "template: create-vm|name: create-vm|create-vm" "$FILE"
rg -n "KVM_EMULATION" "$FILE"
rg -n "install-virt-operator|template: install-virt-operator|name: install-virt-operator" "$FILE"

Repository: stackrox/infra

Length of output: 221

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
echo "== file =="
wc -l "$FILE"
sed -n '1,140p' "$FILE" | nl -ba
echo "== mid =="
sed -n '140,340p' "$FILE" | nl -ba
echo "== end region around patch =="
sed -n '400,520p' "$FILE" | nl -ba
echo "== rg sections =="
rg -n "KVM_EMULATION|create-vm|install-virt-operator|create-vm" "$FILE"

Repository: stackrox/infra

Length of output: 186

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
echo "== file =="
ls -l "$FILE"
wc -l "$FILE"
echo "== head (1-160) =="
sed -n '1,160p' "$FILE" | nl -ba
echo "== mid (160-360) =="
sed -n '160,360p' "$FILE" | nl -ba
echo "== around patch (420-510) =="
sed -n '420,510p' "$FILE" | nl -ba
echo "== around templates (scan more) =="
sed -n '360,440p' "$FILE" | nl -ba
echo "== rg matches =="
rg -n "KVM_EMULATION|create-vm|install-virt-operator|create-vm:|template: create-vm|template: install-virt-operator|name: create-vm|name: install-virt-operator" "$FILE"

Repository: stackrox/infra

Length of output: 307

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
echo "== file =="
ls -l "$FILE"
wc -l "$FILE"
echo "== head (1-160) =="
sed -n '1,160p' "$FILE" | nl -ba
echo "== mid (160-360) =="
sed -n '160,360p' "$FILE" | nl -ba
echo "== around patch (420-510) =="
sed -n '420,510p' "$FILE" | nl -ba
echo "== around templates (360-440) =="
sed -n '360,440p' "$FILE" | nl -ba
echo "== rg matches =="
rg -n "KVM_EMULATION|create-vm|install-virt-operator|template: create-vm|template: install-virt-operator|name: create-vm|name: install-virt-operator" "$FILE"

Repository: stackrox/infra

Length of output: 307

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
echo "== rg matches =="
rg -n "template: create-vm|name: create-vm|template: install-virt-operator|name: install-virt-operator|KVM_EMULATION|create-vm" "$FILE" || true
echo "== patch area (420-520) with line numbers via awk =="
sed -n '420,520p' "$FILE" | awk '{printf "%d\t%s\n", NR+419, $0}'
echo "== earlier context (360-440) =="
sed -n '360,440p' "$FILE" | awk '{printf "%d\t%s\n", NR+359, $0}'
echo "== find where create-vm is called/invoked (search for 'template: create-vm' blocks) =="
rg -n "template:\s*create-vm" "$FILE" || true
rg -n "name:\s*create-vm" "$FILE" || true

Repository: stackrox/infra

Length of output: 10137

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="chart/infra-server/static/workflow-openshift-4.yaml"
# Print the workflow steps section where install-virt-operator and create-vm are invoked
sed -n '1,120p' "$FILE" | awk '{printf "%d\t%s\n", NR, $0}'
echo "----"
# Also print around the invocation list lines ~50-90 if needed
sed -n '40,95p' "$FILE" | awk '{printf "%d\t%s\n", NR, $0}'

Repository: stackrox/infra

Length of output: 5122

Add a post-KVM_EMULATION reconciliation wait before create-vm.

install-virt-operator patches KVM_EMULATION at the end (lines ~444–455) and the workflow then immediately runs create-vm (lines ~67–74 in the start steps). Add a second wait after the patch to ensure the HyperConverged Operator/KubeVirt has reconciled the updated subscription config (there’s no single KVM_EMULATION_READY condition to watch; re-check HCO’s health conditions like Available=True, Progressing=False, Degraded=False after the patch, and/or verify the updated operand state).
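
A rough sketch of such a post-patch wait, polling the three health conditions named above. The resource name kubevirt-hyperconverged, the timeout and interval defaults, and both function names are assumptions for illustration and should be checked against the workflow's existing HCO wait:

```shell
#!/bin/bash
# Hypothetical helpers; resource name and defaults are assumptions, not from the PR.
HCO_NAME="${HCO_NAME:-kubevirt-hyperconverged}"
HCO_NAMESPACE="${HCO_NAMESPACE:-openshift-cnv}"

# Prints the status ("True"/"False") of one HyperConverged condition, or nothing.
hco_condition() {
  oc get hyperconverged "$HCO_NAME" -n "$HCO_NAMESPACE" \
    -o jsonpath="{.status.conditions[?(@.type==\"$1\")].status}" 2>/dev/null || echo ""
}

# Polls until Available=True, Progressing=False and Degraded=False, or times out.
wait_for_hco_healthy() {
  local timeout="${1:-600}" interval="${2:-10}" elapsed=0
  local available progressing degraded
  while true; do
    available=$(hco_condition Available)
    progressing=$(hco_condition Progressing)
    degraded=$(hco_condition Degraded)
    if [ "$available" = "True" ] && [ "$progressing" = "False" ] && [ "$degraded" = "False" ]; then
      echo "HyperConverged is healthy after ${elapsed}s"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "ERROR: HyperConverged not healthy after ${timeout}s" \
        "(Available=$available Progressing=$progressing Degraded=$degraded)" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

Calling `wait_for_hco_healthy` right after the `oc patch subscription` command would stop create-vm from starting until the conditions settle. The sketch uses a fixed poll interval rather than a backoff to stay short.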


coderabbitai · 2026-06-10T11:26:49Z

+          VM_PASSWORD=$(openssl rand -hex 10)
+
+          echo "=== Creating VM: $VM_NAME (OS: $VM_OS, dedicated=$DEDICATED) ==="
+
+          # Copy the already-provisioned Quay pull secret from the infra namespace into the target cluster
+          echo "Creating pull secret for quay.io/rhacs-eng in namespace $NAMESPACE from mounted infra secret..."
+          if [ ! -s /infra-secrets/quay/.dockerconfigjson ]; then
+            echo "ERROR: mounted registry secret /infra-secrets/quay/.dockerconfigjson is missing or empty"
+            exit 1
+          fi
+          cat <<EOFSECRET | oc apply -f -
+          apiVersion: v1
+          kind: Secret
+          metadata:
+            name: ${PULL_SECRET_NAME}
+            namespace: ${NAMESPACE}
+          type: kubernetes.io/dockerconfigjson
+          data:
+            .dockerconfigjson: $(base64 < /infra-secrets/quay/.dockerconfigjson | tr -d '\n')
+          EOFSECRET
+          echo "Pull secret created"
+
+          # Check if VM already exists
+          if oc get vm "$VM_NAME" -n "$NAMESPACE" &>/dev/null; then
+            STATUS=$(oc get vm "$VM_NAME" -n "$NAMESPACE" -o jsonpath='{.status.printableStatus}' 2>/dev/null || echo "Unknown")
+            echo "VM $VM_NAME already exists (status: $STATUS)"
+          fi


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't regenerate exported credentials when the VM already exists.

VM_PASSWORD is generated before the existence check, and vm-access.md is always rewritten. On retries or resumed workflows, the artifact can publish a password that was never applied to the guest.

Also applies to: 624-647
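
One possible shape for the reordered flow is sketched below: the existence check runs first, and the password and artifact are produced only on the creation branch. The function name and the artifact layout are hypothetical, and the actual VirtualMachine creation is elided:

```shell
#!/bin/bash
# Hypothetical helper; the artifact layout below is illustrative only.
ensure_vm_credentials() {
  local vm_name="$1" namespace="$2" artifact="$3"
  if oc get vm "$vm_name" -n "$namespace" >/dev/null 2>&1; then
    # The VM already carries the password from an earlier run, so a new
    # password would never reach the guest: leave the artifact untouched.
    echo "VM $vm_name already exists; not regenerating credentials"
    return 0
  fi
  VM_PASSWORD=$(openssl rand -hex 10)
  # ... apply the VirtualMachine manifest using $VM_PASSWORD here (omitted) ...
  printf 'VM: %s\nNamespace: %s\nPassword: %s\n' \
    "$vm_name" "$namespace" "$VM_PASSWORD" > "$artifact"
  echo "Wrote credentials for new VM $vm_name to $artifact"
}
```

If a retry runs in a fresh pod, the artifact from the first attempt may be gone. In that case the credentials would have to be recovered from wherever they were stored, such as a Secret, which this sketch does not cover.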


coderabbitai · 2026-06-10T11:26:49Z

+          # Wait for VMI to be ready
+          echo "Waiting for VMI $VM_NAME to be ready..."
+          timeout=300
+          elapsed=0
+          while true; do
+            PHASE=$(oc get vmi "$VM_NAME" -n "$NAMESPACE" -o jsonpath='{.status.phase}' 2>/dev/null || echo "")
+            if [ "$PHASE" = "Running" ]; then
+              echo "VMI is Running"
+              break
+            fi
+            if [ $elapsed -ge $timeout ]; then
+              echo "WARNING: VMI did not reach Running phase after ${timeout}s (current: $PHASE)"
+              echo "VM was created but may still be starting"
+              break
+            fi
+            if [ $((elapsed % 30)) -eq 0 ] && [ $elapsed -gt 0 ]; then
+              echo "Waiting for VMI (phase: ${PHASE:-Pending}, ${elapsed}s)..."
+            fi
+            sleep 10
+            elapsed=$((elapsed + 10))
+          done


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail the virt path when the VMI never reaches Running.

This timeout only logs a warning and exits successfully, so image-pull/scheduling/operator failures still leave the workflow in its normal suspended state. If you need the artifact for debugging, persist it first and then exit non-zero on timeout.
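
A hedged sketch of a wait loop that fails on timeout after saving diagnostics. The function name, the debug directory, and the output file names are assumptions made for illustration:

```shell
#!/bin/bash
# Hypothetical helper; the debug directory and file names are assumptions.
wait_for_vmi_running() {
  local vm_name="$1" namespace="$2" timeout="${3:-300}" interval="${4:-10}"
  local debug_dir="${5:-/tmp}" elapsed=0 phase
  while true; do
    phase=$(oc get vmi "$vm_name" -n "$namespace" \
      -o jsonpath='{.status.phase}' 2>/dev/null || echo "")
    if [ "$phase" = "Running" ]; then
      echo "VMI is Running"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "ERROR: VMI $vm_name did not reach Running after ${timeout}s" \
        "(current: ${phase:-unknown})" >&2
      # Persist diagnostics first; collection errors must not mask the failure.
      oc get vmi "$vm_name" -n "$namespace" -o yaml > "$debug_dir/vmi.yaml" 2>&1 || true
      oc get events -n "$namespace" --sort-by=.lastTimestamp > "$debug_dir/events.txt" 2>&1 || true
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

In the workflow script the caller would end with something like `wait_for_vmi_running "$VM_NAME" "$NAMESPACE" || exit 1`, so the step fails instead of leaving the workflow in its normal suspended state.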


vikin91 and others added 3 commits June 10, 2026 11:21

feat(ocp4-virt): add virt parameters and flavor metadata

4f18cae

Co-authored-by: Cursor <cursoragent@cursor.com>

feat(ocp4-virt): add optional virtualization workflow and VM access a…

7af0e8b

…rtifact Co-authored-by: Cursor <cursoragent@cursor.com>

fix(ocp4-virt): use jq for node readiness and always write vm-access …

a5cc60f

…artifact Co-authored-by: Cursor <cursoragent@cursor.com>

vikin91 wants to merge 3 commits into master from piotr/ROX-35000-VM
