feat(mergeconflict): make merge-conflict check asynchronous via runway#245
Draft
behinddwalls wants to merge 1 commit into
This was referenced Jun 15, 2026
behinddwalls commented Jun 16, 2026
**Outputs are unchanged except `pusher`.** This RFC moves the *input* toward identity; five of the six return contracts (conflicts, score, mergeability, change info, build id/status) are exactly what they are today. `pusher` is the lone exception: because its input becomes a *list* of independently-landed batches, its result regroups per batch (`BatchID`-tagged, per-change commit detail kept underneath) so each batch's outcome stays correlatable, following the "output mirrors the input unit" principle above. No other output shape changes.

> **Note (superseded for the validate path).** The `mergechecker.MergeChecker` row above describes the *synchronous*, in-process mergeability check `validate` used to run. That check is now **asynchronous and out-of-process**: `validate` hands off to the `mergeconflict` controller, which publishes a full check request to the runway-owned `merge-conflict-checker` queue; runway performs the merge attempt and returns a result that `mergeconflictsignal` consumes (see [workflow.md](workflow.md)). The `mergechecker` extension package is kept in-tree but unused on the validate path; removing it is a follow-up. The row is retained here as the historical contract.
Let's update the RFC to reflect the new contract rather than adding a note here.
Comment on lines +30 to +34

    // TopicKeyMergeConflict is the internal pipeline stage where validated
    // requests are published to start an asynchronous merge-conflict check. The
    // mergeconflict controller consumes it, records the check, and publishes the
    // full request to runway's merge-conflict-checker queue.
    TopicKeyMergeConflict TopicKey = "mergeconflict"
"Records the check"? We removed the storage layer, so nothing is recorded anymore; the request itself captures the state.
Comment on lines +122 to +124

        // User error: reject to DLQ, where the request is reconciled to Error.
        return errs.NewUserError(fmt.Errorf("request %s is not mergeable: %s", request.ID, result.Reason))
    }
Why reject to the DLQ when we already know the request has failed? We should just mark it failed.
## Summary

### Why?

The merge-conflict check ran synchronously inside the `validate` consumer by calling the `mergechecker` extension inline. A real merge attempt is slow and I/O-heavy, so doing it on the partition lease blocks the pipeline and couples SubmitQueue to the checker's latency. This moves the check to an asynchronous round-trip with runway, modelled on `build`/`buildsignal` but across a service boundary.

### What?

The pipeline gains `validate → mergeconflict ⇢ (runway) ⇢ mergeconflictsignal → batch`:

- `validate` drops the inline `mergechecker` call and now publishes the request id to the internal `mergeconflict` topic (dedup + change-metadata + claim are unchanged).
- `mergeconflict` (new) publishes the full `MergeConflictCheckRequest` to the runway-owned `merge-conflict-checker` queue, keyed by the request id as the client-owned correlation id. No local record is needed: the request id round-trips, so the result correlates straight back to the request (unlike `build`, whose server-generated id needs a mapping store).
- `mergeconflictsignal` (new) consumes runway's `MergeConflictCheckResult` off `merge-conflict-checker-signal`, advances the request to `batch` when mergeable, or fails it (user error) when conflicted.
- DLQ reconcilers drive the request to `Error` on dead-letter; the signal DLQ reads the request id straight off the result.

Crossing the runway boundary is why these payloads carry full data rather than entity IDs; the new queue-payload-boundary rule is documented in CLAUDE.md, with the pipeline diagram and stage table updated in workflow.md and the superseded `mergechecker` validate-path row noted in extension-contract.md. The `mergechecker` package is left in-tree (unused on the validate path); removing it is a follow-up. Runway's service implementation is out of scope; only its contract (added in the parent PR) is consumed here.

## Test Plan

- ✅ `bazel build //...`
- ✅ `bazel test //... --test_tag_filters=-integration,-e2e` (54 tests pass)
- ✅ `make gazelle` clean
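The correlation scheme described in the summary can be sketched in a few lines. This is an illustration under stated assumptions, not the PR's code: the type names `MergeConflictCheckRequest` and `MergeConflictCheckResult` come from the description, but their fields, the `Message` envelope, and the helpers `newCheckMessage`, `fakeRunway`, and `nextTopic` are hypothetical, and `fakeRunway` merely stands in for the out-of-scope runway service.

```go
package main

import "fmt"

// Field layouts are assumed; the PR excerpt names these types but not their fields.
type MergeConflictCheckRequest struct {
	RequestID string   // client-owned correlation id
	Changes   []string // full data rather than entity IDs, since it crosses a service boundary
}

type MergeConflictCheckResult struct {
	RequestID string // echoed back unchanged by the checker
	Mergeable bool
	Reason    string
}

// Message is a hypothetical queue envelope: a partition key plus the payload.
type Message struct {
	Key     string
	Payload MergeConflictCheckRequest
}

// newCheckMessage keys the outbound message by the request id, so no local
// mapping from a server-generated id back to the request is needed.
func newCheckMessage(requestID string, changes []string) Message {
	return Message{
		Key:     requestID,
		Payload: MergeConflictCheckRequest{RequestID: requestID, Changes: changes},
	}
}

// fakeRunway stands in for the remote checker and echoes the request id.
func fakeRunway(m Message, conflicted bool) MergeConflictCheckResult {
	res := MergeConflictCheckResult{RequestID: m.Payload.RequestID, Mergeable: !conflicted}
	if conflicted {
		res.Reason = "conflicting edits"
	}
	return res
}

// nextTopic is what a signal consumer might decide: advance to batch when
// mergeable, otherwise report a failure for the correlated request.
func nextTopic(res MergeConflictCheckResult) (string, error) {
	if res.Mergeable {
		return "batch", nil
	}
	return "", fmt.Errorf("request %s is not mergeable: %s", res.RequestID, res.Reason)
}

func main() {
	msg := newCheckMessage("req-1", []string{"change-a"})
	topic, err := nextTopic(fakeRunway(msg, false))
	fmt.Println(msg.Key, topic, err)
}
```

Because the id on the result is the same id the client chose, the signal side needs no lookup table, which is the contrast with `build` that the description draws.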