fix: data branch create table + DROP DATABASE race condition #23788

Merged
mergify[bot] merged 13 commits into matrixorigin:main from LeftHandCold:fix_clone_drop_race on Mar 5, 2026
Conversation


@LeftHandCold (Contributor) commented Mar 1, 2026

What type of PR is this?

  • API-change
  • BUG ✓
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #23766

What this PR does / why we need it:

Two-layer fix for orphan records in mo_tables caused by concurrent data branch create table (CLONE) and DROP DATABASE, which leads to an ExpectedEOB panic on checkpoint replay.

Fix 1 - Lock Service: Lock mode sync on holder promotion
Lock is a Go value type. When addHolder is called during tryHold,
modifications to l.value (lock mode byte) don't propagate to the
store. Add setMode() method to correctly update the stored lock
entry's mode when a waiter is promoted to holder (newHolder=true).
This ensures Exclusive locks actually block Shared requests.

Files: lock.go, lock_table_local.go, types.go
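The value-type pitfall behind Fix 1 can be reduced to a few lines. The `Lock`, store, and `setMode` below are simplified stand-ins, not the pkg/lockservice types (the real store is a btree and the real value byte packs more flags); only the copy semantics matter here — mutating the copy returned by a lookup never touches the stored entry, so a mode change must be written back explicitly.

```go
package main

import "fmt"

// Simplified stand-ins: the low bit of value carries the lock mode.
const (
	modeShared    byte = 0
	modeExclusive byte = 1
)

type Lock struct {
	value byte
}

func (l Lock) mode() byte { return l.value & 1 }

// setMode returns an updated copy plus whether anything changed, so the
// caller knows it must write the copy back into the store.
func (l Lock) setMode(m byte) (Lock, bool) {
	if l.mode() == m {
		return l, false
	}
	l.value = (l.value &^ 1) | m
	return l, true
}

func main() {
	store := map[string]Lock{"row1": {value: modeShared}}

	// Buggy pattern: Lock is a value type, so mutating the lookup copy
	// leaves the stored entry untouched.
	l := store["row1"]
	l.value = modeExclusive
	fmt.Println(store["row1"].mode() == modeExclusive) // false: store is stale

	// Fixed pattern: setMode + explicit write-back, as in the PR.
	if updated, changed := store["row1"].setMode(modeExclusive); changed {
		store["row1"] = updated
	}
	fmt.Println(store["row1"].mode() == modeExclusive) // true
}
```

This is why an Exclusive waiter promoted to holder could leave a Shared mode byte in storage: the promotion mutated a copy, and later conflict checks consulted the stale stored entry.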

Fix 2 - DDL Layer: Snapshot refresh after exclusive lock (clock.Now)
After acquiring the exclusive lock on mo_database in DropDatabase, call
refreshSnapshotAfterLock, which uses clock.Now() + WaitLogTailAppliedAt
+ UpdateSnapshot to advance the transaction snapshot. This ensures
Relations() sees all tables committed before the lock was acquired.

Uses clock.Now() instead of GetLatestCommitTS() (the v1 approach) because
doWrite's defer LIFO ordering causes unlock to execute before
updateLastCommitTS, making the v1 conditional check always false in the
critical scenario.

Files: ddl.go, ddl_test.go

@qodo-code-review

Review Summary by Qodo

Fix race condition between CLONE and DROP DATABASE via lock mode sync

🐞 Bug fix


Walkthroughs

Description
• Fix lock mode synchronization when waiter promoted to holder
  - Add setMode() method to update stored lock entry mode
  - Ensures Exclusive locks block Shared requests correctly
• Fix snapshot staleness in DROP DATABASE with concurrent CLONE
  - Use clock.Now() instead of GetLatestCommitTS() for snapshot refresh
  - Prevents orphan records in mo_tables and ExpectedEOB panic
• Add comprehensive tests for lock mode transitions
  - Verify Exclusive holder blocks Shared requests
  - Verify Shared holder allows subsequent Shared requests
Diagram
flowchart LR
  A["Concurrent CLONE<br/>and DROP DATABASE"] -->|Lock mode<br/>not synced| B["Stale lock mode<br/>in storage"]
  B -->|Shared requests<br/>incorrectly allowed| C["Orphan records<br/>in mo_tables"]
  C -->|Checkpoint replay| D["ExpectedEOB panic"]
  
  E["Fix 1: setMode()"] -->|Sync mode on<br/>holder promotion| F["Correct lock mode<br/>in storage"]
  G["Fix 2: clock.Now()<br/>snapshot refresh"] -->|Advance snapshot<br/>after exclusive lock| H["Relations() sees<br/>all committed tables"]
  F --> I["Race condition<br/>resolved"]
  H --> I

File Changes

1. pkg/lockservice/lock.go 🐞 Bug fix +23/-0

Add setMode() for lock mode synchronization

• Add setMode() method to update lock mode in stored lock entry
• Handles transitions between Shared and Exclusive modes
• Includes detailed comments explaining value type semantics

2. pkg/lockservice/lock_table_local.go 🐞 Bug fix +6/-0

Apply setMode() on holder promotion in lock acquisition

• Call setMode() after waiter promotion in acquireRowLockLocked()
• Call setMode() after waiter promotion in addRangeLockLocked()
• Write updated lock back to store when mode changes

3. pkg/lockservice/lock_table_local_test.go 🧪 Tests +129/-0

Add tests for lock mode transition correctness

• Add TestExclusiveHolderMustBlockSharedRequests test
 - Verifies Exclusive holder blocks subsequent Shared requests
 - Tests waiter promotion scenario
• Add TestSharedAfterExclusiveRelease test
 - Verifies Shared holder allows subsequent Shared requests
 - Tests mode transition from Exclusive to Shared

4. pkg/lockservice/types.go 📝 Documentation +6/-0

Document Lock value type behavior and setMode() requirement

• Add WARNING comment explaining Lock value type semantics
• Document that modifications to value field don't propagate to storage
• Explain setMode() requirement for mode changes

5. pkg/sql/compile/ddl.go 🐞 Bug fix +31/-20

Replace conditional snapshot refresh with clock.Now() approach

• Replace GetLatestCommitTS() conditional check with unconditional clock.Now() approach
• Extract snapshot refresh logic into refreshSnapshotAfterLock() function
• Use clock.Now() + WaitLogTailAppliedAt() to advance snapshot
• Add detailed comments explaining defer LIFO ordering issue

6. pkg/sql/compile/ddl_test.go 🧪 Tests +35/-29

Update tests for unconditional snapshot refresh logic

• Simplify test to reflect unconditional clock.Now() approach
• Remove snapshotTS and latestCommitTS comparison logic
• Rename test case to snapshot_always_refreshed_via_clock_now
• Add test for error propagation from refreshSnapshotAfterLock
• Remove test case for conditional snapshot refresh (no longer applicable)



qodo-code-review bot commented Mar 1, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0)



Action required

1. Snapshot refresh may stall 🐞 Bug ⛯ Reliability
Description
refreshSnapshotAfterLock waits via TxnClient.WaitLogTailAppliedAt(now) and then calls
TxnOperator.UpdateSnapshot(ts), but both paths ultimately depend on TimestampWaiter.GetTimestamp();
since GetTimestamp returns latest.Next(), passing that result into UpdateSnapshot can require a
subsequent logtail advance and may add latency or stall DDL under low commit activity. This is
triggered for every Exclusive mo_database lock (e.g. CreateDatabase and DropDatabase) because the
call is unconditional for lockMode==Exclusive.
Code

pkg/sql/compile/ddl.go[R4018-4024]

+var refreshSnapshotAfterLock = func(c *Compile) error {
+	now, _ := moruntime.ServiceRuntime(c.proc.GetService()).Clock().Now()
+	ts, err := c.proc.Base.TxnClient.WaitLogTailAppliedAt(c.proc.Ctx, now)
+	if err != nil {
+		return err
+	}
+	return c.proc.GetTxnOperator().UpdateSnapshot(c.proc.Ctx, ts)
+}
Evidence
refreshSnapshotAfterLock first calls TxnClient.WaitLogTailAppliedAt(now) and then
UpdateSnapshot(ts). WaitLogTailAppliedAt delegates to timestampWaiter.GetTimestamp, which returns
latest.Next(); UpdateSnapshot again calls timestampWaiter.GetTimestamp, so if the first call
returned latest.Next() (greater than latest), the second call can block until the waiter’s latest
timestamp advances to >= that value (i.e., an additional logtail/commit tick). Also, lockMoDatabase
calls refreshSnapshotAfterLock unconditionally for every Exclusive lock, expanding the impact to
DDLs beyond DropDatabase.

pkg/sql/compile/ddl.go[3983-4010]
pkg/sql/compile/ddl.go[4014-4025]
pkg/txn/client/client.go[515-522]
pkg/txn/client/timestamp_waiter.go[61-79]
pkg/txn/client/operator.go[714-740]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`refreshSnapshotAfterLock` currently calls `TxnClient.WaitLogTailAppliedAt(now)` and then `TxnOperator.UpdateSnapshot(ts)`. Both use `TimestampWaiter.GetTimestamp`, and `GetTimestamp` returns `latest.Next()`, so feeding its result into `UpdateSnapshot` can require an additional logtail/commit advance and may add latency or stall DDL under low activity.
### Issue Context
This helper is invoked for every `lockMoDatabase(..., Exclusive)` call, so the risk applies to multiple DDLs (not just `DropDatabase`).
### Fix Focus Areas
- pkg/sql/compile/ddl.go[3983-4025]
- pkg/txn/client/client.go[515-522]
- pkg/txn/client/timestamp_waiter.go[61-79]
- pkg/txn/client/operator.go[714-740]
### Suggested change
- Replace the helper body with a single snapshot update call, e.g.:
- compute `now` and then call `txnOp.UpdateSnapshot(ctx, now)` (no explicit `WaitLogTailAppliedAt` call).
- Reintroduce guards similar to the old code:
- only do this for pessimistic + RC isolation (or at minimum RC), and
- only when `txnOp.Txn().SnapshotTS` is behind the chosen timestamp to reduce unnecessary waits.
- If only `DropDatabase` needs this behavior, move the refresh call out of the shared `lockMoDatabase` helper and into `DropDatabase` (or add a parameter/flag to control it).



2. Range-lock mode inconsistency 🐞 Bug ✓ Correctness
Description
When a waiter is promoted to holder, the new code writes back an updated Lock mode for only the
single conflicting key; but a range lock is represented by two store entries (range-start and
range-end) with independent Lock.value bytes, so the paired entry can retain stale mode and still
allow/deny Shared requests incorrectly (especially at boundaries where a row lock can hit the
range-start key).
Code

pkg/lockservice/lock_table_local.go[R655-657]

+					if updated, changed := conflictWith.setMode(c.opts.Mode); changed {
+						l.mu.store.Add(conflictKey, updated)
+					}
Evidence
Range locks are created as two distinct Lock values (start/end) derived from one base lock; each
copy has its own value byte. The promotion fix updates mode via store.Add(conflictKey, updated)
but only for the one key being operated on, so the other half of the range pair can keep stale mode
bits. This matters because row-lock acquisition may operate on a range-start entry when
bytes.Equal(key,row) (i.e., when the row equals the range boundary).

pkg/lockservice/lock_table_local.go[401-427]
pkg/lockservice/lock_table_local.go[645-663]
pkg/lockservice/lock.go[47-53]
pkg/lockservice/lock.go[243-251]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Range locks are represented by two entries (range-start and range-end) with independent `Lock.value` bytes. The new promotion write-back updates only one key’s `value` via `store.Add`, so the paired entry can keep stale mode bits and still be consulted by later lock attempts (notably at boundary keys).
### Issue Context
- `newRangeLock` returns two `Lock` values; the range-start/range-end bits are set via value receivers, so each entry’s `value` diverges.
- `acquireRowLockLocked` can hit a range-start entry when the row equals the start boundary (`bytes.Equal(key,row)`), so stale mode on the start entry can still matter.
### Fix Focus Areas
- pkg/lockservice/lock_table_local.go[401-427]
- pkg/lockservice/lock_table_local.go[645-663]
- pkg/lockservice/lock.go[47-53]
- pkg/lockservice/lock.go[243-251]
### Suggested change
- When `setMode(...)` reports `changed==true` and the lock is part of a range lock, locate the paired entry (start↔end) and apply the same mode update to it as well.
- One approach under `l.mu`:
- If `conflictWith.isLockRangeEnd()`, walk backwards with `store.Prev(...)` until you find a `isLockRangeStart()` whose `holders` pointer (or `waiters` backing queue identity) matches.
- If `conflictWith.isLockRangeStart()`, walk forwards with `store.Seek/Range` until you find the matching `isLockRangeEnd()`.
- Add a targeted test that constructs a range lock, promotes an exclusive waiter to holder, and then attempts a shared row lock exactly at the range start key to ensure it blocks.




Remediation recommended

3. DDL test hides real logic 🐞 Bug ⛯ Reliability
Description
The updated DropDatabase test stubs refreshSnapshotAfterLock itself, so it does not validate the
production logic (clock.Now()+waiter behavior) and can miss regressions in the actual implementation
semantics.
Code

pkg/sql/compile/ddl_test.go[R904-913]

+		// Stub refreshSnapshotAfterLock to simulate the clock.Now() path:
+		// it calls WaitLogTailAppliedAt and UpdateSnapshot unconditionally.
+		refreshStub := gostub.Stub(&refreshSnapshotAfterLock, func(c *Compile) error {
+			ts, err := c.proc.Base.TxnClient.WaitLogTailAppliedAt(c.proc.Ctx, appliedTS)
+			if err != nil {
+				return err
+			}
+			return c.proc.GetTxnOperator().UpdateSnapshot(c.proc.Ctx, ts)
+		})
+		defer refreshStub.Reset()
Evidence
By replacing refreshSnapshotAfterLock with a stub, the test only asserts that DropDatabase invokes
the helper and propagates errors; it does not exercise the real code path that reads the runtime
clock and performs the actual wait/update behavior.

pkg/sql/compile/ddl_test.go[904-913]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Current tests stub `refreshSnapshotAfterLock`, which prevents them from validating the real implementation and the clock.Now() behavior.
### Issue Context
The helper’s correctness depends on clock.Now() and timestamp-waiter semantics; stubbing the helper bypasses those.
### Fix Focus Areas
- pkg/sql/compile/ddl_test.go[822-975]
- pkg/sql/compile/ddl.go[4014-4025]
### Suggested change
- Add a focused unit test for `refreshSnapshotAfterLock` itself:
- Provide a controllable runtime clock (or a mock ServiceRuntime/Clock if available in test utilities).
- Use a real or mocked `TimestampWaiter`/`TxnClient` behavior that reflects production semantics.
- Assert that `UpdateSnapshot` is called with the expected input and that the function does not require additional external stubbing.
- Keep the existing integration-style DropDatabase test, but avoid stubbing the helper in the main “happy path” subtest if feasible.




…e only

refreshSnapshotAfterLock was inside lockMoDatabase and triggered for all
Exclusive lock acquisitions. CreateDatabase also calls lockMoDatabase with
Exclusive mode, so during restore cluster operations the snapshot advance
caused CreateDatabase to see already-restored tables, producing:
  Duplicate entry '(0,mo_catalog,mo_branch_metadata)' for key '__mo_cpkey_col'

Move refreshSnapshotAfterLock to DropDatabase directly, since only
DropDatabase needs the snapshot advance (to enumerate tables via Relations()).
lockMoDatabase is now a simple wrapper around doLockMoDatabase.
LeftHandCold and others added 8 commits March 4, 2026 17:28
…eordering doWrite defers

Two fixes for the data-branch-create-table + DROP DATABASE race that
leaves orphan rows in mo_tables:

1. lockservice: sync mode on both ends of range lock (lock_table_local.go)

   When a waiter is promoted to holder, setMode updates Lock.value on
   the conflicting key only. But a range lock is stored as two
   independent btree entries (range-start and range-end) with separate
   Lock.value bytes. The paired entry retains stale mode, causing
   isLockModeAllowed to incorrectly allow or deny Shared requests.

   Add setModePairedRangeLock helper that scans past interleaved row
   locks to find and update the paired entry. Called from both
   acquireRowLockLocked and addRangeLockLocked when newHolder=true.
   No-op for row locks.
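The pairing logic above can be sketched with a sorted slice standing in for the btree store. The names (`entry`, `setModeBoth`, the flag constants) are hypothetical, not the lockservice API; the point is that a range lock lives as two independent entries, so an upgrade must scan past any interleaved row-lock keys to reach and update the paired half.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical flag layout: low bit is the Exclusive mode, the other two
// bits mark the two halves of a range lock.
const (
	flagExclusive  byte = 1 << 0
	flagRangeStart byte = 1 << 1
	flagRangeEnd   byte = 1 << 2
)

type entry struct {
	key   string
	value byte
}

func (e entry) isRangeStart() bool { return e.value&flagRangeStart != 0 }
func (e entry) isRangeEnd() bool   { return e.value&flagRangeEnd != 0 }

// setModeBoth upgrades the entry at key to Exclusive and, if it is one
// half of a range lock, scans past interleaved row locks to upgrade the
// paired half too. Updating only one half would leave stale mode bits
// that later conflict checks consult. No-op for plain row locks.
func setModeBoth(store []entry, key string) {
	i := sort.Search(len(store), func(i int) bool { return store[i].key >= key })
	store[i].value |= flagExclusive
	switch {
	case store[i].isRangeStart():
		for j := i + 1; j < len(store); j++ { // forward to the paired end
			if store[j].isRangeEnd() {
				store[j].value |= flagExclusive
				return
			}
		}
	case store[i].isRangeEnd():
		for j := i - 1; j >= 0; j-- { // backward to the paired start
			if store[j].isRangeStart() {
				store[j].value |= flagExclusive
				return
			}
		}
	}
}

func main() {
	// Range [a, d] stored as two entries, with a row lock on b between them.
	store := []entry{
		{"a", flagRangeStart},
		{"b", 0},
		{"d", flagRangeEnd},
	}
	setModeBoth(store, "a")
	for _, e := range store {
		fmt.Println(e.key, e.value&flagExclusive != 0)
	}
}
```

Upgrading at "a" also upgrades "d" while skipping the row lock at "b"; the same holds when starting from the range-end key.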

2. txn: reorder defers in doWrite to update latestCommitTS before unlock (operator.go)

   Original code registered two independent defers in doWrite:
     defer tc.unlock(ctx)                    // LIFO: executes first
     defer func() { closeLocked(); mu.Unlock() }()  // executes second

   This meant unlock released the lock-service lock before closeLocked
   triggered updateLastCommitTS. In the window between unlock and
   closeLocked, DROP DATABASE could acquire the Exclusive lock and call
   GetLatestCommitTS, getting a stale value. The UpdateSnapshot fix
   (layer 2) would then not trigger because SnapshotTS >= stale
   latestCommitTS.

   Merge unlock into the first defer so execution order becomes:
     closeLocked() -> mu.Unlock() -> unlock(ctx)
   This ensures latestCommitTS is updated before the lock is released.
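The defer-ordering change can be demonstrated in isolation. The helpers below are stand-ins for the operator.go code, with strings logged in place of the real closeLocked/unlock calls: Go runs deferred calls in LIFO order, so with two independent defers the one registered last fires first, while folding both actions into one defer makes the internal order explicit.

```go
package main

import "fmt"

// buggyOrder mimics the original doWrite: two independent defers, so the
// defer registered last ("unlock") runs first, releasing the lock before
// latestCommitTS is updated.
func buggyOrder() []string {
	var log []string
	run := func() {
		defer func() { log = append(log, "closeLocked+updateLastCommitTS") }()
		defer func() { log = append(log, "unlock") }() // registered last, runs first
	}
	run()
	return log
}

// fixedOrder mimics the fix: unlock is merged into the same defer, after
// closeLocked, so latestCommitTS is updated before the lock is released.
func fixedOrder() []string {
	var log []string
	run := func() {
		defer func() {
			log = append(log, "closeLocked+updateLastCommitTS")
			log = append(log, "unlock")
		}()
	}
	run()
	return log
}

func main() {
	fmt.Println(buggyOrder()) // [unlock closeLocked+updateLastCommitTS]
	fmt.Println(fixedOrder()) // [closeLocked+updateLastCommitTS unlock]
}
```

In the buggy ordering, the window between "unlock" and the commit-TS update is exactly where a concurrent DROP DATABASE could take the Exclusive lock and read a stale GetLatestCommitTS.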

mergify bot commented Mar 5, 2026

Merge Queue Status

Rule: main


  • Entered queue: 2026-03-05 07:41 UTC
  • Checks passed · in-place
  • Merged: 2026-03-05 07:42 UTC · at 1d8aaaf713fc514fd4929fceec3ac99069134f38

This pull request spent 10 seconds in the queue, with no time running CI.

Required conditions to merge
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
  • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
  • branch-protection-review-decision = APPROVED [🛡 GitHub branch protection]
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
    • check-neutral = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
    • check-skipped = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
    • check-neutral = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
    • check-skipped = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
    • check-neutral = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
    • check-skipped = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone CI / SCA Test on Ubuntu/x86
    • check-neutral = Matrixone CI / SCA Test on Ubuntu/x86
    • check-skipped = Matrixone CI / SCA Test on Ubuntu/x86
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone CI / UT Test on Ubuntu/x86
    • check-neutral = Matrixone CI / UT Test on Ubuntu/x86
    • check-skipped = Matrixone CI / UT Test on Ubuntu/x86
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
    • check-neutral = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
    • check-skipped = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
    • check-neutral = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
    • check-skipped = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
    • check-neutral = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
    • check-skipped = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Utils CI / Coverage
    • check-neutral = Matrixone Utils CI / Coverage
    • check-skipped = Matrixone Utils CI / Coverage


Labels

kind/bug (Something isn't working) · size/L (Denotes a PR that changes [500,999] lines)


Development

Successfully merging this pull request may close these issues.

5 participants