feat(insert): align ON DUPLICATE KEY UPDATE with MySQL, drop legacy operator#24973

Open
ck89119 wants to merge 3 commits into
matrixorigin:mainfrom
ck89119:issue-24939-main

@ck89119 ck89119 commented Jun 13, 2026

Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24939

What this PR does / why we need it:

Replace the legacy Node_ON_DUPLICATE_KEY operator with the modern
DEDUP JOIN + MULTI_UPDATE path for INSERT ... ON DUPLICATE KEY UPDATE,
and make real-PK tables align with MySQL.

Behavior change (real-PK, MySQL alignment)

Previously, on a table with a real primary key, only a primary-key conflict
triggered the update; any unique-key conflict raised a duplicate-entry error.
Now any unique-key conflict (PRIMARY or a secondary UNIQUE key) updates the
first conflicting row, checked in PRIMARY > unique-key definition order, which
matches MySQL.
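
To make the new rule concrete, here is a minimal Python sketch of the semantics described above. It is a toy model, not the MatrixOne implementation: the table is a list of dicts, and `find_conflict`, `insert_odku`, and the sample columns are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]
Key = Tuple[str, ...]


def find_conflict(rows: List[Row], new_row: Row, keys: Sequence[Key]) -> Optional[Row]:
    """Return the first existing row that conflicts with new_row.

    `keys` lists PRIMARY first, then secondary UNIQUE keys in definition
    order, so the first key that matches decides the UPDATE target.
    """
    for key_cols in keys:
        probe = tuple(new_row[c] for c in key_cols)
        if any(v is None for v in probe):
            continue  # a NULL component never conflicts in a UNIQUE key
        for row in rows:
            if tuple(row[c] for c in key_cols) == probe:
                return row
    return None


def insert_odku(rows: List[Row], new_row: Row, keys: Sequence[Key], updates: Row) -> str:
    """Insert new_row, or apply `updates` in place to the first conflicting row."""
    target = find_conflict(rows, new_row, keys)
    if target is None:
        rows.append(dict(new_row))
        return "inserted"
    target.update(updates)  # primary key and non-SET columns stay untouched
    return "updated"


if __name__ == "__main__":
    keys = [("id",), ("email",)]  # PRIMARY, then UNIQUE(email)
    table = [
        {"id": 1, "email": "a@x", "name": "old"},
        {"id": 2, "email": "b@x", "name": "bob"},
    ]
    # New id, but the email collides with row 1: row 1 is updated, not rejected.
    print(insert_odku(table, {"id": 3, "email": "a@x", "name": "new"}, keys, {"name": "new"}))
    print(table[0])
```

In this model an incoming row whose `id` matches one row and whose `email` matches another updates the row found through PRIMARY, because PRIMARY is probed first.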

Implementation: real-PK ODKU resolves a single UPDATE target up front via
target_pk = coalesce(pk-existence-probe, uk1_pri, uk2_pri, ...) (treating
PRIMARY as the 0th index) and keys the main dedup-update join on target_pk.
The conflicting row is updated in place (its primary key and non-SET columns
are preserved). The per-unique-key FAIL dedup is kept as in-batch duplicate
protection: two brand-new rows that share a new unique-key value in the same
statement still error deterministically (documented gap vs MySQL's in-batch
sequential semantics).
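
The target resolution and the in-batch protection can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the planner code: each unique-key lookup is assumed to map a key value to the primary key of the row that owns it (the `ukN_pri` values above), and `resolve_target_pk`, `check_in_batch_duplicates`, and the data layout are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Row = Dict[str, object]
# One entry per unique key: (column names, {key value -> owning row's primary key}).
UkLookup = Tuple[Tuple[str, ...], Dict[tuple, object]]


def coalesce(*values):
    """Return the first non-None value, like SQL COALESCE."""
    for v in values:
        if v is not None:
            return v
    return None


def resolve_target_pk(new_row: Row, pk_col: str, existing_pks: Set[object],
                      uk_lookups: List[UkLookup]) -> Optional[object]:
    """Model of target_pk = coalesce(pk-existence-probe, uk1_pri, uk2_pri, ...)."""
    # The PK probe yields the incoming PK only if a row with that PK exists.
    pk_probe = new_row[pk_col] if new_row[pk_col] in existing_pks else None
    uk_hits = []
    for cols, index in uk_lookups:
        key = tuple(new_row[c] for c in cols)
        uk_hits.append(None if None in key else index.get(key))
    return coalesce(pk_probe, *uk_hits)


def check_in_batch_duplicates(batch: Iterable[Row], uk_lookups: List[UkLookup]) -> None:
    """Raise if two rows in one statement share a brand-new unique-key value.

    Values already present in the table resolve to an UPDATE target instead
    and are not checked here.
    """
    for cols, index in uk_lookups:
        seen = set()
        for row in batch:
            key = tuple(row[c] for c in cols)
            if None in key or key in index:
                continue
            if key in seen:
                raise ValueError(f"Duplicate entry {key} for unique key {cols}")
            seen.add(key)


if __name__ == "__main__":
    lookups = [(("email",), {("a@x",): 1, ("b@x",): 2})]
    # id=3 does not exist, but email "a@x" belongs to the row with PK 1.
    print(resolve_target_pk({"id": 3, "email": "a@x"}, "id", {1, 2}, lookups))
```

A `None` result means no existing row conflicts, so the incoming row is a plain insert.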

Modern path coverage

  • fake-PK tables (first usable unique key as the anchor) and FK-table ODKU run
    on the modern path; FK parent existence is enforced via generated DetectSqls.
  • Tables carrying irregular (vector / fulltext) indexes run on the modern path;
    the irregular index is stripped by getValidIndexes and maintained
    asynchronously.
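
As a rough illustration of the two bullets above, the following Python sketch filters out irregular indexes and picks an anchor for a fake-PK table. It assumes that "usable" means a regular (non-vector, non-fulltext) unique index; `IndexDef`, `valid_indexes`, and `fake_pk_anchor` are hypothetical names, and the real `getValidIndexes` may apply further rules.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IndexDef:
    name: str
    columns: Tuple[str, ...]
    unique: bool
    algo: Optional[str] = None  # e.g. "ivfflat" or "fulltext"; None for a regular index


def valid_indexes(indexes: List[IndexDef]) -> List[IndexDef]:
    """Drop irregular (vector / fulltext) indexes; they are maintained elsewhere."""
    return [ix for ix in indexes if ix.algo is None]


def fake_pk_anchor(indexes: List[IndexDef]) -> Optional[IndexDef]:
    """Pick the first regular unique index, in definition order, as the anchor."""
    for ix in valid_indexes(indexes):
        if ix.unique:
            return ix
    return None


if __name__ == "__main__":
    defs = [
        IndexDef("idx_vec", ("embedding",), unique=False, algo="ivfflat"),
        IndexDef("uk_email", ("email",), unique=True),
        IndexDef("uk_phone", ("phone",), unique=True),
    ]
    print(fake_pk_anchor(defs).name)
```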

Cleanup

Remove the now-dead Node_ON_DUPLICATE_KEY enum, the OnDuplicateKeyCtx and
pipeline.OnDuplicateKey messages, and the OnDuplicateKey vm opcode
(reserved in proto).

Tests

  • Plan unit tests for real-PK single / composite unique-key resolution.
  • End-to-end behavior cross-checked against MySQL 8.4.
  • BVT: insert_duplicate, on_duplicate_key_edge (rewritten),
    on_duplicate_key_modern (real-PK multi-UK + in-batch protection +
    irregular-index cases); the whole dml/insert suite passes on the
    multi-CN cluster.

🤖 Generated with Claude Code

…perator

Replace the legacy Node_ON_DUPLICATE_KEY operator with the modern
DEDUP JOIN + MULTI_UPDATE path for INSERT ... ON DUPLICATE KEY UPDATE, and
make real-PK tables align with MySQL: any unique-key conflict (PRIMARY or a
secondary UNIQUE key) updates the first conflicting row, in
PRIMARY > unique-key definition order, instead of raising a duplicate-entry
error.

- real-PK ODKU resolves a single UPDATE target up front via
  target_pk = coalesce(pk-existence-probe, uk1_pri, uk2_pri, ...), treating
  PRIMARY as the 0th index, and keys the main dedup-update join on target_pk.
  The per-unique-key FAIL dedup is kept as in-batch duplicate protection
  (two brand-new rows sharing a new unique-key value still error).
- fake-PK tables (first unique key as anchor) and FK-table ODKU are handled
  on the modern path; FK parent existence is enforced via generated DetectSqls.
- Tables carrying irregular (vector / fulltext) indexes use the modern path;
  the irregular index is stripped by getValidIndexes and maintained
  asynchronously.
- Remove the now-dead Node_ON_DUPLICATE_KEY enum, the OnDuplicateKeyCtx and
  pipeline.OnDuplicateKey messages, and the OnDuplicateKey vm opcode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…table

Dropping the legacy ON DUPLICATE KEY operator broke ivfflat/hnsw/cagra/fulltext
index creation. Index maintenance upserts a version counter into the index
metadata table via INSERT ... ON DUPLICATE KEY UPDATE. That metadata table is a
plain real-PK key/value table, but it carries an algo-specific TableType
("metadata"/"hnsw_meta"/...) and a secondary-index name, so:

  - the modern insert guard rejected it as "insert into vector/text index table",
    and the ODKU could no longer fall back to the (now removed) legacy operator;
  - canSkipDedup skipped the dedup-update join the ODKU needs to fetch the old row.

Fix on the modern path: allow an ODKU-update through the index-table guard, and
do not skip dedup when onDupAction is UPDATE. Plain index-maintenance inserts
(no ODKU) keep their existing behavior. The metadata table is a normal real-PK
table that the modern dedup + multi-update path handles correctly.
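
The shape of the fix can be sketched as two small predicates. This Python model is an assumption-laden illustration, not the actual Go code: the previous decisions of the index-table guard and of `canSkipDedup` are passed in as booleans, and the `OnDupAction` members and function names are hypothetical.

```python
from enum import Enum


class OnDupAction(Enum):
    FAIL = 0
    IGNORE = 1
    UPDATE = 2


def guard_rejects(old_guard_rejects: bool, action: OnDupAction) -> bool:
    """Index-table guard: an ODKU update is let through, everything else is unchanged."""
    return old_guard_rejects and action is not OnDupAction.UPDATE


def skip_dedup(old_can_skip: bool, action: OnDupAction) -> bool:
    """Never skip dedup for an ODKU update: the join is needed to fetch the old row."""
    return old_can_skip and action is not OnDupAction.UPDATE


if __name__ == "__main__":
    # ODKU upsert into the index metadata table: allowed, and dedup is kept.
    print(guard_rejects(True, OnDupAction.UPDATE), skip_dedup(True, OnDupAction.UPDATE))
    # Insert without ODKU: both decisions stay as they were.
    print(guard_rejects(True, OnDupAction.FAIL), skip_dedup(True, OnDupAction.FAIL))
```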

Adds a plan unit test (ODKU into a metadata-typed index table builds a
MULTI_UPDATE) and a BVT covering ivfflat index creation plus user ODKU on the
indexed table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Labels

kind/feature, kind/refactor (Code refactor), size/XXL (Denotes a PR that changes 2000+ lines)
