
feat: Refactor agent and hub sync for better readability #208

Open: maximbigler wants to merge 43 commits into main from refactor/deploy-state-machine

@maximbigler maximbigler commented Jun 22, 2026

Member

Type of change

  • 🐛 Bug fix
  • 🚀 New feature
  • ❓ Other (please specify): Refactor

Description

Fixes #172
Fixes #211

Refactors the repository-sync and application-deploy flow in internal/hub/applications to make it readable and to remove duplicated/divergent logic. No new feature; behavior is preserved except for the intentional changes called out below.

Application-level deploy state machine

  • Introduced updateApplicationStatus as the single chokepoint for every application status write + SSE publish; the four mark* helpers are now thin wrappers over it. Fixes a bug where markDeploymentInProgress published an SSE event twice on a DB error.
  • Extracted applyDeployResult (+ deployNotifications) — one shared interpreter of an agent's deploy outcome (err / nil result / !Success / success), replacing two divergent copies that previously lived in the sync path and the manual-deploy path.
  • Split Deployer.StartDeploy into dispatchDeploy (transport-only) and StartDeploy (dispatch + mark in-progress). The sync path uses dispatchDeploy via DeployAndWait because processSyncJob already owns the in-progress transition, so each path now marks the application in-progress exactly once instead of twice.
  • Rewrote processSyncJob to use linear control flow instead of the success bool + defer pattern.
  • Bug fix: terminal status writes now use context.Background() instead of the job context. A deploy that hit the 3-minute job timeout previously wrote its failure status on an already-cancelled context, which could leave the application stuck displaying Syncing.

Unified sync entry points

  • Added SyncApplications, the single path that all four sync triggers funnel through (polling, manual repo sync, webhooks, GitHub Actions). It performs consistent repository-status bookkeeping (Syncing → Success/Failed).
  • Added CommitResolver with StaticCommit (webhooks and GitHub Actions already carry the pushed SHA) and LatestCommit (polling and commit-less generic webhooks), separating "where the commit comes from" from the shared sync logic.
  • SyncRepository, WebhookHandler, handleGenericWebhook, and the GitHub Actions handler now delegate to SyncApplications; the bespoke enqueueGenericApps helper and the inline repository-status updates were removed.

Documentation

  • Doc comments make the two-layer model explicit (repository-level sync vs. application-level deploy) and clarify that a repository's SyncStatus reflects whether the sync was dispatched (commits resolved, jobs enqueued), not whether every application finished deploying — per-application progress lives on each Application.

Additional context

Intentional behavior changes (please review)

  1. A sync-path transport failure (agent dropped / nil result) now marks the application OutOfSync only, instead of also Unhealthy — matching the manual-deploy path. A dropped connection does not mean the running containers are unhealthy.
  2. Webhooks and GitHub Actions now perform the full Syncing → Success/Failed repository bookkeeping like the poller. Previously webhooks marked Success only (never Failed) and GitHub Actions did no repository-status bookkeeping at all. For these synchronous handlers the Syncing state is effectively instantaneous; it is observable only on the polling path, while the commit lookup goes over the network.
  3. Generic webhooks now skip applications with an empty branch (consistent with the poller) and mark the repository Failed if commit resolution fails, instead of silently enqueuing an empty commit.

Out of scope (planned follow-up)

Removing the Default* package-level globals in favor of dependency injection, which also requires reordering the server.go bootstrap. Deliberately left for a separate PR to keep this diff focused and reviewable.

Testing

applications, routes, and websocket suites pass; gofmt / go vet clean. The only failing tests are the Docker-daemon-dependent ones in internal/agent/docker, which this branch does not touch.

@maximbigler maximbigler force-pushed the refactor/deploy-state-machine branch from 6a3927e to 2b3b396 Compare June 24, 2026 14:58
@maximbigler maximbigler linked an issue Jun 26, 2026 that may be closed by this pull request
8 tasks
@maximbigler maximbigler marked this pull request as ready for review June 26, 2026 22:01
@alex289 alex289 requested a review from timokoessler June 26, 2026 23:34
@alex289

This comment has been minimized.

Comment thread frontend/src/routes/_authenticated/applications/$id.index.tsx Outdated
Comment thread backend/internal/hub/websocket/handler.go Outdated
Comment thread backend/internal/hub/websocket/delete.go Outdated
Comment thread backend/internal/hub/models/applications.go
Comment thread backend/internal/hub/websocket/deploy.go Outdated
Comment thread backend/internal/hub/websocket/deploy.go Outdated
Comment thread backend/internal/agent/docker/deploy.go
Comment thread frontend/src/routes/_authenticated/applications/$id/details.tsx Outdated
Comment thread backend/internal/hub/server.go Outdated
Comment thread backend/internal/agent/docker/status.go
Comment thread backend/internal/hub/crypto/crypto.go Outdated
@alex289

This comment was marked as outdated.

@alex289

This comment was marked as resolved.

@alex289

This comment was marked as resolved.

Comment thread backend/internal/hub/deployer/application_deployer.go Outdated
Comment thread backend/internal/hub/deployer/application_deployer.go
Comment thread frontend/src/routes/_authenticated/applications/$id/details.tsx Fixed
ALTER TABLE applications ADD COLUMN name_hash TEXT NOT NULL DEFAULT '';

-- Partial index excludes legacy rows (empty hash, pre-backfill) so they don't
-- collide on (agent_id, ''). Names were globally unique under the old rule, so

Member


We did not check this before, no?

// nonce, so we match on the name_hash blind index instead of decrypting every
// row. excludeID skips a record (the one being updated). Uniqueness is scoped per
// agent — the same name may exist on different agents.
func applicationNameTaken(ctx context.Context, name, agentID, excludeID string) (bool, error) {

Member


In theory, we could just run the insert and catch the duplicate-key error, as is done elsewhere?

Member


Good idea



Development

Successfully merging this pull request may close these issues.

  • Normalize the compose name and skip check
  • fix: Sync issues

4 participants