docs(spec): Bicep-less Foundry agent init design (#8065) #8577
Design spec for RFC #8065 — make `azd ai agent init` Bicep-less by default with the `azure.ai.agents` extension owning provisioning. Adopts the compromise of explicit `infra.provider: azure.ai.agents` in azure.yaml (per PR #7482's custom provisioning provider framework), deferring service-host-driven auto-routing to v0.3+. Covers in-memory synthesis, on-disk reuse after eject, brownfield `resourceId:` flow, 5-step validation pipeline, and the small Core changes required to surface `uses` / `runtime` on the extension-facing `ServiceConfig`.
Pull request overview
Adds a new design spec documenting the proposed “Bicep-less by default” experience for azd ai agent init, where the azure.ai.agents extension owns provisioning via a custom provisioning provider and can optionally “eject” generated Bicep to ./infra/.
Changes:
- Introduces a detailed design spec for extension-owned provisioning + in-memory Bicep synthesis.
- Defines activation/eject behavior, validation pipeline, and post-eject drift/UX expectations.
- Outlines required (future) Core schema/proto surfacing for `uses` and a service-level `runtime`.
Review-pass revisions on docs/specs/bicepless-foundry/spec.md:
- Fix Core Changes #1: Uses already exists on ServiceConfig (service_config.go:58) and the v1 schema (azure.yaml.json:234); the gap is proto-only. Runtime remains the larger gap. Rewrote the section narrative and trimmed the file-change table accordingly.
- Fix On-Disk Reuse table: azdFileShareUploadOperations is at infra/provisioning/manager.go:125, not :121. Disambiguated all four rows with full paths and 'call'/'gate' suffix.
- Fix Core Changes mapper range: mapper_registry.go:102-162 (was :139-161, which was a sub-slice).
- Remove all v0.2/v0.3+/0.1.x/0.2.x version markers; the spec doesn't own a release schedule.
- Trim Problem, Solution, Provider Resolution, Explicit Declaration, Brownfield, In-Memory Synthesis, and Embedded Templates sections; remove duplicate eject example; consolidate Post-Eject trade paragraph.
- Open Questions: drop --preview-bicep entry, add one-line proposals to the remaining two.
…ecision)
- Preview: replace 'Same as Deploy with validationOnly mode' with 'ARM What-If, mirrors Core's Bicep provider'. validationOnly hits ARM's /validate endpoint and returns template-validity errors; What-If hits /whatIf and returns a real change diff. Core's bicep provider uses WhatIfDeployToResourceGroup (cli/azd/pkg/infra/scope.go:132); the spec now matches.
- pathHasModule row: clarify that os.ReadDir returns NotExist on missing ./infra/, and the caller's 'err == nil && moduleExists' guard is what falls through. Prior wording 'returns false' was imprecise.
…y host, nested agents)
Adopts the consolidated YAML shape from therealjohn/foundry-azd-config-preview/REFERENCE.md and trims the spec accordingly.
Shape changes:
- Host: azure.ai.project + azure.ai.agent (two services, uses: link) -> microsoft.foundry (single service with nested agents[]).
- Provider name: azure.ai.agents -> microsoft.foundry (matches host kind, reads like an engine next to bicep/terraform).
- Brownfield signal: resourceId (ARM ID) -> endpoint (URL, matches Portal/CLI UX).
- Deploy modes: added image: as third option alongside docker:/runtime:.
Scope tightening:
- azd deploy explicitly out of scope (agent code push and data-plane reconciliation are deploy's job).
- Data-plane fields (connections, toolboxes, skills, routines, agent-level tools/skill, $ref) silently ignored by synthesizer; new field-skip table makes this explicit.
- Coexistence with non-Foundry services out of scope; infra.layers[] noted as escape hatch.
Core changes collapsed:
- Removed "Surface uses/runtime to extensions" as a Core ask. With nested agents[], the runtime is inside the service body (read via additional_properties); no proto/struct/mapper plumbing needed.
- Down to two Core changes: relax infra.provider enum, and the deferred auto-install (#7502).
Validation pipeline rewritten to match the new invariants. Per-agent deploy-mode check now allows exactly one of docker/runtime/image. Brownfield validation checks endpoint URL shape. Foundry server-side templating syntax pass-through made explicit.
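The per-agent deploy-mode rule (exactly one of `docker:`, `runtime:`, `image:`) might look roughly like the sketch below. The `Agent` struct, its field types, and the `deployMode` function are hypothetical and exist only to illustrate the invariant; the spec reads these fields from the service body, and the real types may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Agent is a hypothetical, trimmed view of one entry under agents[].
// Only the three deploy-mode fields matter for this check.
type Agent struct {
	Name    string
	Docker  map[string]any // docker: block, nil when absent
	Runtime map[string]any // runtime: block, nil when absent
	Image   string         // image: reference, empty when absent
}

// deployMode returns the single deploy mode set on the agent, or an error
// when zero or more than one mode is present.
func deployMode(a Agent) (string, error) {
	var set []string
	if a.Docker != nil {
		set = append(set, "docker")
	}
	if a.Runtime != nil {
		set = append(set, "runtime")
	}
	if a.Image != "" {
		set = append(set, "image")
	}
	if len(set) != 1 {
		return "", fmt.Errorf(
			"agent %q: exactly one of docker, runtime, image is required; found %d (%s)",
			a.Name, len(set), strings.Join(set, ", "))
	}
	return set[0], nil
}

func main() {
	agents := []Agent{
		{Name: "ok-image", Image: "example.azurecr.io/agent:1"},
		{Name: "none"},
		{Name: "two", Docker: map[string]any{"path": "./Dockerfile"}, Image: "x"},
	}
	for _, a := range agents {
		mode, err := deployMode(a)
		if err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Printf("agent %q deploys via %s\n", a.Name, mode)
	}
}
```

Collecting the set modes into a slice keeps the error message specific about which fields conflict, which is useful when the check runs once per entry in `agents[]`.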
- Remove Core Changes section on auto-install (#7502: already delivered by #7482; not a gap, not in our scope).
- Drop Open Question 2 (schema branch ownership with #7962). It was a coordination artifact, not a real dependency.
- Drop #7962 and #8049 References entries.
- Drop the forward reference to `azd ai agent add monitoring` (per #8049) from Embedded templates; monitoring is out of scope.
- Drop the auto-install Risks row.
Keeps #7962 and #8049 only as one-line out-of-scope pointers in the Scope section.
Spec review
Strong, well-grounded spec. I verified the code citations against
The three questions you flagged
1. Is the data-plane/ARM split in "What the synthesizer ignores" drawn correctly?
2. Drift detection on
3. Eject delete-and-rerun with no
Other findings
@glharper Thanks for the careful review — all 8 points hold up and have been applied to the spec:
…schema, scope)
All 8 substantive points from the maintainer review applied:
- Split "What synthesizer ignores" into two tables: read-for-branching (docker/runtime/image, needed by validation step 3 and ARM branching) vs not-read-at-all (data-plane). Resolves the runtime contradiction.
- Open Question 1 flipped from "no detection" to warn-on-Deploy(). Pseudocode + method-table row now describe the in-memory diff against on-disk Bicep.
- Eject UX now matches azd infra generate (cmd/infra_generate.go:204-210): interactive overwrite prompt, --no-prompt keeps hard-refuse for CI. Post-Eject CLI table and Accepted-trade paragraph updated.
- Destroy row spells out soft-delete purge of Cognitive Services accounts to mirror Core's bicep provider (bicep_provider.go:1283-1413). Without this, up -> down -> up under the same name fails.
- Schema relaxation now pattern: ^[a-z0-9.]+$ + examples, not examples alone. Keeps typo catching for all users.
- Brownfield section + Preview row: both Deploy and Preview now resolve endpoint -> ARM ID + target scope before invoking ARM. Preview can't run on a brownfield project without scope resolution.
- Telemetry section names docs/reference/telemetry-data.md as an implementation-PR deliverable per cli/azd/AGENTS.md:246-249.
- infra.layers[] escape hatch verified inline: InfraLayer.Provider field (provisioning/provider.go:57 -> :40) + ParseProvider accepts any string.
- Stability Contract tightened from "semantically identical" to "byte-stable within a patch extension version / byte-identical Bicep," matching the Test Plan's byte-equal standard.
Test Plan picked up entries for each new behavior: schema pattern, eject overwrite prompt, post-eject Deploy() drift warn, brownfield Preview scope, and an expanded init -> provision -> down -> provision E2E for soft-delete purge.
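To make the schema relaxation concrete, the sketch below applies the `^[a-z0-9.]+$` pattern from this commit to a few candidate `infra.provider` values. The `validProviderName` helper is hypothetical; in practice the pattern would live in the JSON schema for `azure.yaml`, and Go's `regexp` package is used here only to show which names the pattern admits.

```go
package main

import (
	"fmt"
	"regexp"
)

// providerPattern is the pattern quoted in the commit message. Without the
// multiline flag, Go's regexp anchors ^ and $ to the whole input.
var providerPattern = regexp.MustCompile(`^[a-z0-9.]+$`)

// validProviderName (hypothetical helper) reports whether name would pass
// the relaxed infra.provider pattern.
func validProviderName(name string) bool {
	return providerPattern.MatchString(name)
}

func main() {
	candidates := []string{
		"bicep",
		"terraform",
		"microsoft.foundry",
		"Bicep",             // uppercase is rejected
		"microsoft foundry", // whitespace is rejected
		"",                  // empty is rejected
	}
	for _, c := range candidates {
		fmt.Printf("%q valid: %v\n", c, validProviderName(c))
	}
}
```

The pattern rejects malformed names such as wrong case or stray whitespace. A misspelling made entirely of lowercase letters still matches it, so a bad provider name of that kind would only be caught later, when the name is looked up.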
Thanks for the careful review — all 8 points hold up and have been applied to the spec:
Test Plan also picked up entries for each new behavior. Pushing the commit shortly.
Design spec for RFC #8065: make `azd ai agent init` Bicep-less by default, with the `azure.ai.agents` extension owning provisioning via a custom provider named `microsoft.foundry`.
Summary
Moves infrastructure templates from `Azure-Samples/azd-ai-starter-basic` into the `azure.ai.agents` extension binary. `azd ai agent init` produces only `azure.yaml` and an agent code project, with no `infra/` directory. At `azd provision` time, the extension's own provisioning provider synthesizes Bicep in memory from `azure.yaml` and applies it. `azd ai agent init --infra` ejects on demand: same synthesis, written to `./infra/`; subsequent provisions read from disk.
The `azure.yaml` shape is fixed by the Foundry `azure.yaml` reference: a single `host: microsoft.foundry` service per project, with nested `agents:`, `deployments:`, `connections:`, etc. This spec only changes how that file is provisioned; it does not redesign the YAML.
Key design decisions
- `host: microsoft.foundry`: one consolidated service per Foundry project, with nested `agents[]`, `deployments[]`, `connections[]`, `toolboxes[]`, `skills[]`, `routines[]`. Matches the reference doc.
- `infra.provider: microsoft.foundry`: explicit declaration required until service-host-driven auto-routing lands. Reuses the custom provisioning provider framework from "feat: Add provisioning provider support to extension framework" #7482 (merged).
- Brownfield signal is an `endpoint:` URL, not an ARM resource ID. Matches what Portal and `az` CLI show; deploy verb resolves ARM ID from endpoint when it needs control-plane access.
- Deploy modes: `docker:`, `runtime:`, or `image:` (pre-built container). Exactly one is required; validator rejects two-or-none.
- On-disk reuse after eject: the provider reads `./infra/` when present, synthesizes otherwise. `azure.yaml` is never mutated by eject.
- No `--force` flag for eject. To regenerate, the user deletes `./infra/` and re-runs the command.
- `Preview` via ARM What-If: mirrors Core's Bicep provider (`scope.go:132` uses `WhatIfDeployToResourceGroup`).
Scope
In scope: Bicep-less default behavior, eject command, embedded templates, ARM-backed synthesis only (Foundry project + model deployments + ACR when needed), schema relaxation to allow extension-named providers in `infra.provider`.
Out of scope:
- `azd deploy`: agent code push and data-plane reconciliation (connections, toolboxes, skills, routines, agent definitions) are deploy's job, not provisioning's. This spec ends at "ARM resources are in place."
- Data-plane fields: `connections:`, `toolboxes:`, `skills:`, `routines:`, agent-level `tools:`/`skill:`, `$ref:` resolution. Synthesizer reads them only to skip them; deploy verb owns them.
- Service-host-driven auto-routing without an explicit `infra.provider:`. Deferred to a future spec.
- Redesign of the `azure.yaml` schema ("Unify Foundry agent configuration in azure.yaml" #7962); incremental composition ("Add connections, models, tools, and skills to Foundry Agent projects after init" #8049); coexistence with non-Foundry services (use `infra.layers[]` as escape hatch).
Core changes collapsed. The original RFC asked Core to surface `services.<svc>.uses` and a typed `services.<svc>.runtime` on the extension-facing proto. With nested `agents[]`, runtime lives inside the service body (read via `additional_properties`); no proto/struct/mapper plumbing needed. Down to one Core change: relax the `infra.provider` enum.
Related
Notes for reviewers
Doc-only PR. Adds `docs/specs/bicepless-foundry/spec.md`. No code changes: the spec is the implementation contract; code PRs follow.
Particular attention welcomed on:
- `Deploy()`).
- `--force` flag: does this match azd's broader UX posture?