FinOps Hub private deployment fails under 'Subnets should be private' policy / Sept 2025 default-outbound retirement #2161

@MSBrett

Summary

FinOps Hub private deployments (enablePublicAccess: false) fail in environments that enforce the Azure built-in policy "Subnets should be private" (policy definition fe505f54d90b47b3b60de101). The policy denies any subnet whose defaultOutboundAccess is not explicitly set to false. Private subnets also become the platform default after the Sept 2025 retirement of implicit subnet outbound (see https://aka.ms/defaultoutboundaccess).

Repro

  1. Assign the built-in policy "Subnets should be private" at subscription or management-group scope.
  2. Deploy the FinOps Hub template with enablePublicAccess: false and any address prefix.
  3. Deployment fails with RequestDisallowedByPolicy on Microsoft.Network/virtualNetworks/<hub>-vnet-<region>:

       Resource '<hub>-vnet-<region>' was disallowed by policy.
       Policy identifiers: 'Subnets should be private' (fe505f54d90b47b3b60de101)

Root cause

src/templates/finops-hub/modules/Microsoft.FinOpsHubs/Core/infrastructure.bicep defines three subnets (private-endpoint-subnet, script-subnet, dataExplorer-subnet) without setting defaultOutboundAccess: false. The hub's NSG has an explicit AllowInternetOutBound rule (priority 200), so the template was designed around implicit subnet outbound, which neither the policy nor the upcoming platform default permits.

Why the trivial fix is insufficient

Naively adding defaultOutboundAccess: false to the subnets satisfies the policy but breaks the hub because the private deployment depends on outbound internet from two subnets:

| Subnet | Outbound dependency | Why |
| --- | --- | --- |
| script-subnet | mcr.microsoft.com | Microsoft.Resources/deploymentScripts runs ACI containers using the azuredeploymentscripts-powershell image. Microsoft does not expose a private alternative for this image and the schema has no containerSettings.image override (verified against the ARM schema and the AVM deployment-script module). |
| dataExplorer-subnet | raw.githubusercontent.com | The config_InitializeHub ADF pipeline runs four .set-or-replace KQL commands against the ADX cluster using externaldata() to pull PricingUnits.csv, Regions.csv, ResourceTypes.csv, and Services.csv from the toolkit's open-data folder on GitHub. ADX makes the outbound HTTPS call directly. |

private-endpoint-subnet is inbound-only (hosts private endpoints) and does not need outbound.

Explicit egress is therefore mandatory in private mode. The supported pattern per Microsoft docs for ACI-in-VNet is a NAT Gateway with a Standard Public IP (https://learn.microsoft.com/azure/container-instances/container-instances-virtual-network-concepts).
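
The egress pattern described above could look roughly like the following Bicep. This is a minimal sketch, not the actual change to infrastructure.bicep: the parameter names (hubName, enablePrivateRouting), the resource names, the address prefixes, and the API version are assumptions chosen for illustration, and NSG association and subnet delegations are omitted for brevity.

```bicep
// Sketch only: parameter names, resource names, and address prefixes are hypothetical.
param location string = resourceGroup().location
param hubName string = 'myhub'
param enablePrivateRouting bool = true

// Standard static public IP used by the NAT Gateway for explicit egress.
resource natPublicIp 'Microsoft.Network/publicIPAddresses@2023-11-01' = if (enablePrivateRouting) {
  name: '${hubName}-nat-pip'
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    publicIPAllocationMethod: 'Static'
    publicIPAddressVersion: 'IPv4'
  }
}

// NAT Gateway that provides outbound access for the two subnets that need it.
resource natGateway 'Microsoft.Network/natGateways@2023-11-01' = if (enablePrivateRouting) {
  name: '${hubName}-nat'
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    publicIpAddresses: [
      {
        id: natPublicIp.id
      }
    ]
  }
}

resource vnet 'Microsoft.Network/virtualNetworks@2023-11-01' = if (enablePrivateRouting) {
  name: '${hubName}-vnet-${location}'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.20.30.0/26'
      ]
    }
    subnets: [
      {
        // Inbound-only: hosts private endpoints, so no NAT Gateway is attached.
        name: 'private-endpoint-subnet'
        properties: {
          addressPrefix: '10.20.30.0/28'
          defaultOutboundAccess: false
        }
      }
      {
        // Needs egress to mcr.microsoft.com for deployment script containers.
        name: 'script-subnet'
        properties: {
          addressPrefix: '10.20.30.16/28'
          defaultOutboundAccess: false
          natGateway: {
            id: natGateway.id
          }
        }
      }
      {
        // Needs egress to raw.githubusercontent.com for the externaldata() calls.
        name: 'dataExplorer-subnet'
        properties: {
          addressPrefix: '10.20.30.32/28'
          defaultOutboundAccess: false
          natGateway: {
            id: natGateway.id
          }
        }
      }
    ]
  }
}
```

All three subnets set defaultOutboundAccess: false so the policy is satisfied, while only the two subnets with a real outbound dependency reference the NAT Gateway.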

Proposed fix

  1. Set defaultOutboundAccess: false on all three subnets.
  2. Add a Standard NAT Gateway + Standard static Public IP to the hub template, attached to script-subnet and dataExplorer-subnet. Both are gated on the existing private-routing flag so public deployments are unaffected.
  3. While here: introduce a tri-state network mode parameter (public / vnet / private) that replaces the binary enablePublicAccess boolean (kept as a deprecated back-compat shim). The new middle vnet mode deploys only the free VNet + NSG scaffold (no NAT, no PE) so customers can stage a future private upgrade without paying for unused egress. The portal UI (createUiDefinition.json) is updated to surface this via an OptionsGroup with per-mode notes documenting cost and downgrade behavior.
  4. Fix six pre-existing tag-namespace bugs in infrastructure.bicep where Microsoft.Storage/* was used for network resources that should be Microsoft.Network/*, and add the four new network resource types to the portal's TagsByResource resource list.
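
One way the tri-state mode in step 3 could coexist with the deprecated boolean is sketched below. The parameter name networkMode, the empty-string "not set" convention, and the derived variable names are assumptions for illustration; the PR may resolve the shim differently.

```bicep
// Sketch only: networkMode and the derived flags are hypothetical names.
@description('Network mode. Leave empty to fall back to the deprecated enablePublicAccess value.')
@allowed([
  ''
  'public'
  'vnet'
  'private'
])
param networkMode string = ''

@description('Deprecated. Kept as a back-compat shim; ignored when networkMode is set.')
param enablePublicAccess bool = true

// An explicit networkMode wins; otherwise map the legacy boolean onto public/private.
var effectiveMode = !empty(networkMode) ? networkMode : (enablePublicAccess ? 'public' : 'private')

// vnet and private both deploy the VNet + NSG scaffold.
var deployVnet = effectiveMode != 'public'

// Only private mode pays for NAT Gateway egress and private endpoints.
var deployEgress = effectiveMode == 'private'
var deployPrivateEndpoints = effectiveMode == 'private'

output resolvedMode string = effectiveMode
output createsVnet bool = deployVnet
output createsEgress bool = deployEgress
output createsPrivateEndpoints bool = deployPrivateEndpoints
```

With this shape, an existing deployment that passes only enablePublicAccess: false resolves to private, and the new vnet mode is reachable only through the new parameter.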

Notes

  • Mode downgrades (private → vnet or private → public) do not delete orphaned NAT Gateway / Public IP / private endpoints because deployments are incremental by default. This is documented in the new param description and in the portal UI privateModeNote.
  • The new vnet mode delivers a free VNet scaffold only. Storage / Key Vault / ADX retain public endpoints in this mode; this is documented in the UI vnetModeNote so customers don't mistake it for private isolation.
  • Verified end-to-end in three independent Azure deployments (public, vnet, private). Three rubber-duck reviews (gpt-5.5, claude-opus-4.6, lark) converged on remediation findings that are included in the PR.
