fix(meerkat-dbm): stop idle shutdown from terminating the worker mid-query by shriram-devrev · Pull Request #290 · devrev/meerkat

shriram-devrev · 2026-07-01T17:18:29Z

What

Two layered changes so an idle timer can't terminate the duckdb-wasm worker while a query is running:

1. Cancel the pending recycle/shutdown timers when a query starts (_startQueryQueue). They re-arm on the next queue drain. This is the primary fix: no idle timer is left pending during a query.
2. Guard the shutdown timer callback on _isBusy() (queue length OR queue running OR currentQueryItem), matching the recycle timer. This covers the event-loop edge where the timer callback was already queued before the new query cleared it.
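
The two layers can be sketched in isolation as follows. This is a simplified model, not the meerkat-dbm source: the class name MiniDbm, the members enqueue, finishCurrent, onShutdownTimer and hasPendingTimer, and the single shutdown timer (the recycle timer is omitted) are all assumptions made for illustration.

```typescript
// Simplified, hypothetical model of the two-layer fix; not the meerkat-dbm source.
class MiniDbm {
  terminated = false;
  private queue: string[] = [];
  private current: string | undefined = undefined;
  private running = false;
  private shutdownTimer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly shutdownInactiveTime: number;

  constructor(shutdownInactiveTime: number) {
    this.shutdownInactiveTime = shutdownInactiveTime;
  }

  get hasPendingTimer(): boolean {
    return this.shutdownTimer !== undefined;
  }

  enqueue(query: string): void {
    this.queue.push(query);
    if (!this.running) this.startQueryQueue();
  }

  // Layer 1: starting the queue cancels any idle timer armed by the previous drain.
  private startQueryQueue(): void {
    this.clearShutdownTimer();
    this.running = true;
    this.current = this.queue.shift();
  }

  // Call when the in-flight query settles; arms the idle timer once the queue drains.
  finishCurrent(): void {
    this.current = this.queue.shift();
    if (this.current === undefined) {
      this.running = false;
      this.clearShutdownTimer();
      this.shutdownTimer = setTimeout(
        () => this.onShutdownTimer(),
        this.shutdownInactiveTime
      );
    }
  }

  // Layer 2: even if a stale callback still runs, it must not terminate a busy engine.
  onShutdownTimer(): void {
    this.clearShutdownTimer();
    if (this.isBusy()) return;
    this.terminated = true; // stands in for _shutdown() / terminateDB()
  }

  private isBusy(): boolean {
    return this.queue.length > 0 || this.running || this.current !== undefined;
  }

  private clearShutdownTimer(): void {
    if (this.shutdownTimer !== undefined) clearTimeout(this.shutdownTimer);
    this.shutdownTimer = undefined;
  }
}
```

In this model, enqueueing a second query after a drain leaves hasPendingTimer false, and calling onShutdownTimer() by hand while that query is in flight (simulating an already-queued callback) leaves terminated false.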

Also removes the dead _isStaleWorkerError retry from #287.

Why (root cause, verified against prod)

The shutdownInactiveTime timer is armed only when the queue drains (_stopQueryQueue), and was never cleared when the next query started. So a timer armed on the previous drain kept counting down through the start of the next query. _startQueryExecution shifts the query off the queue before running it, so during execution queriesQueue.length === 0 while currentQueryItem is set and queryQueueRunning is true. The timer only checked queriesQueue.length > 0, so it treated the in-flight query as idle and called _shutdown() → terminateDB() mid-query.
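
The blind spot in the old check can be shown with a minimal state model. The QueueState shape and the function names below are assumptions for illustration; only the three fields and the two conditions come from the description above.

```typescript
// Hypothetical, simplified model of the queue state; not the actual meerkat-dbm code.
interface QueueState {
  queriesQueue: string[];
  currentQueryItem: string | undefined;
  queryQueueRunning: boolean;
}

// Pre-fix condition: only entries still waiting in the queue count as work.
function oldTimerSeesWork(s: QueueState): boolean {
  return s.queriesQueue.length > 0;
}

// Post-fix condition: queue length OR queue running OR a current query item.
function isBusy(s: QueueState): boolean {
  return (
    s.queriesQueue.length > 0 ||
    s.queryQueueRunning ||
    s.currentQueryItem !== undefined
  );
}

// Mimics the shift-before-run behaviour: the query leaves the queue first.
function startExecution(s: QueueState): void {
  s.queryQueueRunning = true;
  s.currentQueryItem = s.queriesQueue.shift();
}

const state: QueueState = {
  queriesQueue: ["SELECT 1"],
  currentQueryItem: undefined,
  queryQueueRunning: false,
};
startExecution(state);

// The query is now in flight, yet the queue itself is empty.
console.log(oldTimerSeesWork(state), isBusy(state));
```

After startExecution, the old condition reports no work while isBusy reports work, which is the window in which the stale timer could terminate the worker.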

duckdb-wasm does not throw for this — AsyncDuckDB.postTask on a detached worker only console.errors "cannot send a message since the worker is not set!" and resolves. Confirmed in prod RUM: every event is source: console, handling-stack console error → postTask. That's also why #287's catch-based self-heal never fired — the 0.1.45 build was empirically a no-op (fixed vs unfixed builds showed identical error rates).
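
A small sketch of why a catch-based retry cannot observe this failure mode. DetachedWorkerStub is a hypothetical stand-in, not the duckdb-wasm API, and it is synchronous for brevity, whereas the real call returns a promise that resolves. The stub records the message in an array instead of calling console.error so the outcome can be inspected.

```typescript
// Hypothetical stand-in for a worker-backed handle; NOT the duckdb-wasm API.
class DetachedWorkerStub {
  readonly errors: string[] = [];
  readonly attemptedTasks: string[] = [];

  postTask(task: string): null {
    // Mirrors the behaviour described above: report the problem, then return normally.
    this.attemptedTasks.push(task);
    this.errors.push("cannot send a message since the worker is not set!");
    return null;
  }
}

interface Outcome {
  caught: boolean; // did the catch-based retry path run?
  logged: number; // how many failures were only logged?
}

// A catch-based self-heal in the spirit of the removed retry (names are illustrative).
function runWithCatchRetry(db: DetachedWorkerStub, task: string): Outcome {
  let caught = false;
  try {
    db.postTask(task);
  } catch {
    caught = true; // a reconnect-and-retry would go here, but nothing is ever thrown
  }
  return { caught, logged: db.errors.length };
}
```

Calling runWithCatchRetry on the stub yields caught: false with one logged error, matching the observation that the retry path was effectively a no-op.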

Tests

- does not terminate the worker when the shutdown timer elapses mid-query: a query held in flight (slow preQuery) past shutdownInactiveTime must not trigger terminateDB, and the engine then shuts down normally once idle.
- cancels the pending shutdown timer when a new query starts: a second query started before a prior armed timer elapses must clear it so it never fires.

Full meerkat-dbm suite green (20 dbm.spec + all others); nx build + nx lint clean.

Notes

- Bumps @devrev/meerkat-dbm 0.1.45 → 0.1.46. devrev-web bumps to pick it up.
- Scope: single-DBM race (the dominant production case). The cross-DBM shared-engine variant (multiple DBM types swapping one field) is out of scope here.
- Supersedes #287, "fix(meerkat-dbm): self-heal stale DuckDB worker in query()" (a no-op), and #289, "fix(meerkat-dbm): detect stale worker by engine identity, not error message" (isDetached: reconnects after the fact but doesn't prevent the mid-flight teardown). Both closed.

work-item: ISS-334477

…query

The shutdownInactiveTime timer is armed only when the queue drains (_stopQueryQueue) but was never cleared when the next query started, so a timer armed on the previous drain kept counting down through the start of the next query. When it fired, it only checked `queriesQueue.length > 0`, but a query that has been shifted off the queue executes with queriesQueue.length === 0 (currentQueryItem set, queue running), so it treated the in-flight query as idle and called terminateDB() mid-query, killing the duckdb-wasm worker while a RUN_QUERY / SET-TimeZone postTask was still in flight.

duckdb-wasm does not throw here: postTask on a detached worker only console.errors "cannot send a message since the worker is not set!" and resolves, so the earlier catch-based self-heal (#287) never fired and the error kept surfacing to users on vista/list views.

Fix (two layers):
- Cancel the pending recycle/shutdown timers when a query starts (_startQueryQueue). They re-arm on the next queue drain. This is the primary fix: no idle timer is pending while a query runs.
- Guard the shutdown timer callback on _isBusy() (queue length OR queue running OR currentQueryItem), matching the recycle timer. Covers the event-loop edge where the timer callback was already queued before the new query cleared it.

Also removed the dead _isStaleWorkerError retry from #287 (it matched a thrown message that is only ever console.error'd, never thrown).

Bumps meerkat-dbm 0.1.45 -> 0.1.46.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…'t kill mid-registration (#291)

Follow-up to #290. The idle recycle/shutdown timers judged "idle" only by the query queue (queriesQueue / currentQueryItem). But file-buffer registration (consumers' fetchAndRegisterChunksWithIndexedDb) holds a table lock across its multi-second download and registers buffers on the worker OUTSIDE the query queue and the teardownInProgress barrier. So the timers saw the engine as idle during registration and terminated the worker mid-flight. The next registerFileBuffer/postTask then hit a dead worker ("cannot send a message since the worker is not set!"), which is what surfaced on vista/list views.

Fix:
- TableLockManager.hasActiveLocks(): reports whether any reader/writer lock is currently held.
- DBM._isBusy() now also returns true when hasActiveLocks() is true: a held lock (i.e. an in-flight registration) blocks the idle recycle/shutdown.
- The shutdown timer re-arms itself when it defers on a busy state, so the engine still idles down once a lock-only operation (no trailing query) finishes, with no leaked warm engine.
- setShutdownLock(false) re-arms the idle timer (fixes a latent leak: a timer that fired while locked returned early and was never rescheduled).

Regression test added: a held table lock across the idle-shutdown window must not terminate the worker, and shutdown still fires once the lock releases.

Bumps meerkat-dbm 0.1.46 -> 0.1.47.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
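
The lock-aware busy check and the deferred re-arm from the follow-up can be modelled roughly as below. LockTracker is a hypothetical stand-in for TableLockManager (only the hasActiveLocks name comes from the commit message), and IdleShutdownModel replaces the real timer with an explicit fire() call so the behaviour can be stepped through by hand.

```typescript
// Hypothetical stand-in for TableLockManager: counts held locks per table.
class LockTracker {
  private held = new Map<string, number>();

  acquire(table: string): void {
    this.held.set(table, (this.held.get(table) ?? 0) + 1);
  }

  release(table: string): void {
    const next = (this.held.get(table) ?? 0) - 1;
    if (next > 0) this.held.set(table, next);
    else this.held.delete(table);
  }

  // True while any lock is held on any table.
  hasActiveLocks(): boolean {
    return this.held.size > 0;
  }
}

// Hypothetical model of the idle-shutdown decision; a real version would use setTimeout.
class IdleShutdownModel {
  readonly locks = new LockTracker();
  queueBusy = false;
  armed = false;
  terminated = false;

  // A held lock counts as work, even when the query queue is empty.
  isBusy(): boolean {
    return this.queueBusy || this.locks.hasActiveLocks();
  }

  arm(): void {
    this.armed = true;
  }

  // Stands in for the timer callback firing after the idle window.
  fire(): void {
    this.armed = false;
    if (this.isBusy()) {
      this.arm(); // defer, but re-arm so the engine still idles down later
      return;
    }
    this.terminated = true;
  }
}
```

With a lock held, fire() leaves the model armed and not terminated; once the lock is released, the next fire() terminates it, which mirrors the regression test described in the commit message.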

shriram-devrev requested review from itsTalwar, ujaval403, vpbs2 and zaidjan-devrev as code owners July 1, 2026 17:18

shriram-devrev force-pushed the fix/duckdb-shutdown-mid-query-race branch from c4b37c6 to db71bc6 Compare July 1, 2026 17:33

zaidjan-devrev reviewed Jul 1, 2026

Comment thread meerkat-dbm/src/dbm/dbm.ts

shriram-devrev force-pushed the fix/duckdb-shutdown-mid-query-race branch from db71bc6 to 2bb1e5b Compare July 1, 2026 18:11

zaidjan-devrev approved these changes Jul 1, 2026

shriram-devrev merged commit 23fa9ec into main Jul 1, 2026
4 of 5 checks passed

shriram-devrev mentioned this pull request Jul 1, 2026

fix(meerkat-dbm): treat held table locks as busy so idle shutdown can't kill mid-registration #291

Merged
