fix: support MySQL user-level lock functions#24909
Conversation
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
XuPeng-SH
left a comment
There was a problem hiding this comment.
I found multiple blocking issues in the current implementation.
-
The named-lock state is only a package-global in-memory map inside one CN process. In MatrixOne this means two sessions routed to different CNs can both acquire the same lock name at the same time, so the core exclusivity guarantee is broken for real cross-session coordination.
-
Locks are never released when the owning session disconnects. The only mutations of the new lock map are explicit acquire/release calls, and this PR does not add any session-teardown cleanup path. So a client that exits without
RELEASE_LOCK()can leave the lock stuck forever until the CN restarts, which is not compatible with MySQL named-lock semantics. -
GET_LOCK(name, timeout)handles timeout semantics incorrectly:timeout <= 0returns0immediately, but MySQL compatibility requirestimeout = 0to mean no-wait and negative timeout to mean wait indefinitely (subject to session cancellation). -
The unhappy-path coverage is still thin around the new blocking behavior. There are no tests for disconnect cleanup, successful wait-then-acquire, timeout expiry, context cancellation while waiting, or the reentrant refcount path.
Because the first two are real behavior bugs and the third is a compatibility bug in the exposed SQL contract, I am requesting changes.
c6ab3b7 to
5ca0b17
Compare
|
Updated the PR to address the review request:
Tests run:
|
XuPeng-SH
left a comment
There was a problem hiding this comment.
Re-reviewed latest head 5b3f328768fdf241787f6ca7f21b4030bbba4b8f. The previous RELEASE_LOCK() NULL semantics issue is fixed, but there are still correctness gaps in lock-name handling that can break MySQL compatibility and mutual exclusion.
Blocking issues:
-
pkg/sql/plan/function/func_unary.go:6824:validateUserLevelLockNameuseslen(name), so the 64 limit is enforced in bytes, not characters. MySQL documents the limit as 64 characters, and the MySQL WL for this feature says lock names are converted to UTF-8 before applying the new 64-character semantics: https://dev.mysql.com/doc/refman/8.4/en/locking-functions.html and https://dev.mysql.com/worklog/task/?id=1159. A name of 64 Chinese characters currently returnsinternal error: user-level lock name exceeds maximum length of 64 characters, even though it is exactly 64 characters. MatrixOne already has UTF-8 length helpers in this package; this should validate character/code-point length, and add a multibyte boundary test. -
pkg/sql/plan/function/func_unary.go:6838/pkg/sql/plan/function/func_unary.go:6938: lock names are used as raw strings for the distributed row key and the local refcount key, making the implementation case-sensitive. MySQL's user-lock implementation performs case-insensitive comparison and uses a lowercased lock name as the MDL key (see the same WL above). Today two sessions can both acquire what MySQL treats as the same lock: session AGET_LOCK('case_lock', 0)returns 1, then session BGET_LOCK('CASE_LOCK', 0)also returns 1. That breaks the advisory-lock mutual exclusion guarantee. Please normalize the lock name consistently before validating/tracking/locking and add contention tests across case variants. -
pkg/sql/plan/function/func_unary.go:6824: empty lock names are accepted because validation only checks> 64. MySQL's WL calls NULL, empty string, and longer-than-64 names wrong-name inputs. CurrentGET_LOCK('', 0)creates a real distributed lock, which leaves MatrixOne with non-MySQL state and release semantics for an invalid name. Please reject''consistently inGET_LOCK,RELEASE_LOCK, andIS_FREE_LOCK, with tests.
Local verification:
- PR tests pass:
go test ./pkg/sql/plan/function -run 'TestUserLevelLock|TestReleaseUserLevelLocksCleanup|TestUserLevelLockBuiltinRegistration|Test_funids' -count=1 - Frontend compile check passes:
go test ./pkg/frontend -run TestNonExistent -count=1 git diff --check upstream/4.0-dev...HEADpasses- Added temporary review-only tests for the three cases above; all failed as described, then removed them before submitting this review.
- Replace validateUserLevelLockName with normalizeUserLevelLockName that lowercases names for case-insensitive matching (MySQL requirement). - Enforce 64-character limit using utf8.RuneCountInString (character count), not len() (byte count), matching MySQL's documented behavior. - Reject empty lock names in GET_LOCK, RELEASE_LOCK, and IS_FREE_LOCK. - Add tests for empty-name rejection, case-insensitive contention, and multibyte character boundary (64/65 Chinese characters). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ock-functions # Conflicts: # pkg/sql/plan/function/function_id.go # pkg/sql/plan/function/function_id_test.go
XuPeng-SH
left a comment
There was a problem hiding this comment.
Reviewed the latest fix approach, ignoring CI status. The implementation now addresses the earlier correctness issues around distributed locking, session cleanup, timeout semantics, RELEASE_LOCK NULL/0 semantics, empty/overlong names, multibyte length, and case-insensitive lock names. However, this PR introduces new public SQL functions (GET_LOCK, RELEASE_LOCK, IS_FREE_LOCK) for Prisma/MySQL compatibility and currently adds only Go unit tests; there is no BVT/end-to-end SQL case under test/distributed/cases. Please add at least one function BVT covering the SQL contract, e.g. GET_LOCK('prisma_migrate_lock', 0), IS_FREE_LOCK before/after acquire/release, RELEASE_LOCK for held and never-created locks, and NULL argument behavior. The fake-lockservice unit tests are strong, but they do not verify SQL parsing/planning/execution or result NULL/0/1 behavior through mo-tester.
XuPeng-SH
left a comment
There was a problem hiding this comment.
Re-reviewed the latest head with CI status ignored. The previous issues are addressed: the implementation uses the lockservice for cross-CN coordination, releases session-owned locks on Session.Close, handles timeout semantics, returns the correct RELEASE_LOCK 1/0/NULL states, normalizes names case-insensitively, validates empty/overlong/multibyte names, and preserves reentrant ownership. The new BVT now covers the public SQL path for GET_LOCK/RELEASE_LOCK/IS_FREE_LOCK, including NULL and never-created lock behavior. No blocking issue found in the fix approach.
What type of PR is this?
Which issue(s) does this PR fix or relate to?
Fixes #24728
What this PR does / why we need it:
What
Why
Fixes #24728. Prisma Migrate calls GET_LOCK before applying migrations.
Tests