bundle: read deployment state from DMS (experimental.record_deployment_history)#5355
bundle: read deployment state from DMS (experimental.record_deployment_history)#5355shreyas-goenka wants to merge 29 commits into
Conversation
The deadcode lint check flagged this as unreachable on PR #5355's CI. ManagedState returned (string, bool) but no caller existed — every consumer uses IsManagedState (the bool wrapper). Drop it; reintroduce only if a caller needs the raw value. Co-authored-by: Isaac
|
Commit: f50dfc0
23 interesting tests: 15 SKIP, 7 KNOWN, 1 flaky
Top 44 slowest tests (at least 2 minutes):
|
0e706b8 to
71a41d9
Compare
71a41d9 to
3a3613e
Compare
1e8ba45 to
76bf017
Compare
3a3613e to
825f83a
Compare
2faa22b to
063dad9
Compare
5833370 to
a9c126f
Compare
Introduce a StateReader interface with two implementations that populate the direct engine's resource-state DB: - file-based, reading the local resources.json (delegates to DeploymentState.Open, preserving WAL recovery + migration) - DMS-based, reading from the deployment metadata service via the SDK Bundle.ListResourcesAll This is a self-contained, drop-in abstraction with unit tests; it is not yet wired into the deploy path. Integration (selecting the reader by managed-state and sourcing the deployment ID) follows once the DMS lock and op-reporting PRs land. Co-authored-by: Isaac
Wire the StateReader into the direct-engine read path. NewStateReader selects the DMS reader when experimental.record_deployment_history is enabled and a prior deployment exists (lineage recorded in resources.json), otherwise the local file reader. process.go replaces the direct StateDB.Open call with the selector, so plan/deploy/summary read resource state from the deployment metadata service under the flag. The deployment ID is the state lineage (matching the lock package), read from the local resources.json; with no lineage yet (first deploy) there is nothing in DMS, so the local file reader is used. Co-authored-by: Isaac
When record_deployment_history is enabled on an existing direct deployment, the lineage is already in resources.json but DMS has no resources for it. Reading an empty set would make the plan re-create every resource. The DMS reader now keeps resources.json's resources until DMS has its own (recorded by the next deploy).
…sfully When record_deployment_history is on and resources.json has a lineage, DMS is treated as authoritative only if the deployment has a version that completed successfully. Otherwise — DMS not initialized for this deployment yet, or the initial DMS deploy failed — defer to the local file so existing resources are neither re-created nor lost. Replaces the earlier empty-resource-set heuristic.
…ersion Add CreateVersion/CompleteVersion/ListVersions handlers so tests can mark a deployment's version completed, and have the dms/read test record a successful version so DMS is the authoritative source for the read.
| // | ||
| // Deployment state has two parts: | ||
| // | ||
| // - Identity: the lineage (the deployment id) and serial. These always live in |
There was a problem hiding this comment.
If we integrate with the worksapce asset platform then the identity will no longer come from resources.json. It'll instead from directly from workspace.state_path configured for the bundle - which would point to a deployment entity.
| // did not complete successfully, DMS state is absent or partial and callers | ||
| // should fall back to the local file. | ||
| func deploymentHasSuccessfulVersion(ctx context.Context, client sdkbundle.BundleInterface, deploymentID string) (bool, error) { | ||
| versions, err := client.ListVersionsAll(ctx, sdkbundle.ListVersionsRequest{ |
There was a problem hiding this comment.
Why do we do this client-side and not server-side? Will this work when I have 10k versions deployed?
There was a problem hiding this comment.
The list API returns in reverse order w.r.t version ID (so latest version first).
But your point is completely fair. We should denormalize this into the deployment so we prevent this listing.
Do you think that is a blocker or can we followup once we have the API side changes? It'll be a relatively fast followup (needs a bit of design work)
There was a problem hiding this comment.
if this API is paginated, can we process it page by page?
|
|
||
| // readLocalDatabase parses the local resources.json file. A missing file yields | ||
| // an empty database (no lineage), which callers read as "nothing deployed yet". | ||
| func readLocalDatabase(path string) (dstate.Database, error) { |
There was a problem hiding this comment.
this looks like it reimplements StateDB.Open, why?
Address review: drop the StateReader interface and its readLocalDatabase duplicate of StateDB.Open. Open now takes a deployment-metadata-service client (nil = file-only) and overlays DMS resource state on top of the local identity (lineage/serial) when DMS owns the deployment. The version-completion gate and resource fetch move into the dstate package. Co-authored-by: Isaac
Address review: deploymentHasSuccessfulVersion no longer materializes the full version history. Versions are listed newest-first, so iterating page by page and stopping at the first successful version typically reads just one page even for deployments with thousands of versions. The resource fetch builds its map from the paginated iterator as well. Co-authored-by: Isaac
| // overlayDMSState replaces the file-derived resource state with the state | ||
| // recorded in the deployment metadata service, when DMS owns this deployment. | ||
| // The caller holds db.mu and has already populated db.Data from the file. | ||
| func (db *DeploymentState) overlayDMSState(ctx context.Context, client sdkbundle.BundleInterface) error { |
There was a problem hiding this comment.
We can add a serial based comparision and override here as a followup.
Git Bash rewrites leading-slash arguments like /api/2.0/... to C:/Program Files/Git/api/..., so the raw DMS api calls hit no stub on Windows. Prefix them with MSYS_NO_PATHCONV=1, matching acceptance/cmd/api. Co-authored-by: Isaac
Self-review follow-ups: restore the note that an authoritative DMS resource set is trusted even when empty (lost when the reader moved into dstate), and add a unit test asserting the version scan stops at the first success rather than consuming the full paginated list. Co-authored-by: Isaac
Collapse the four TestOpenWithDMS subtests into a table asserting the resulting key->ID map, and fold the standalone early-exit test into the version-gate table as a wantNexts column, dropping the iterator injection hook from the fake client. Co-authored-by: Isaac
Summary
Reads the direct engine's resource state from the deployment metadata service
(DMS) when
experimental.record_deployment_historyis enabled.This is a no-op unless the experimental flag is on. With the flag off, state is
read entirely from the local
resources.json, exactly as before.How it works
State has two parts: identity (lineage + serial, always from
resources.json) and resources (the deployed set). The lineage is the DMSdeployment id, so a later deploy must reuse it rather than mint a new one.
The DMS read is folded directly into
StateDB.Open— no separate readerabstraction:
dmsClient == nil→ file-only behavior (WAL recovery + migration), unchanged.dmsClient != niland the file has a lineage → DMS is the source of truth:Open keeps the file's identity but overlays the resource set read from DMS
(
ListResourcesAll, keys re-prefixed withresources.).DMS is trusted only once a version has completed successfully
(
deploymentHasSuccessfulVersion). If the flag was just enabled on an existingdirect deployment, or the initial DMS deploy failed, there is no successful
version yet, so Open falls back to the local file and existing resources are
neither re-created nor lost. The version check pages through
ListVersions(newest-first) and stops at the first success, so it does not fetch the whole
version history; denormalizing this onto the deployment to avoid the listing
entirely is a planned API-side follow-up.
cmd/bundle/utils/process.gopasses the client (b.WorkspaceClient(ctx).Bundle)only when the flag is set; reads open the state write-disabled, so no lineage is
minted on read.
Review notes
This revision addresses review feedback to drop the
StateReaderinterface andits
readLocalDatabase, which duplicatedStateDB.Open. The version-completiongate and resource fetch now live in
bundle/direct/dstate/dms.go.Tests
bundle/direct/dstate/dms_test.gocover the Open selection(DMS-owned vs file fallback vs nil client), the version-completion gate, and
the resource mapping.
acceptance/bundle/dms/readseeds a completed DMS version +resources and asserts the plan derives create/update/delete from DMS state.