
fix(meerkat-dbm): detect stale worker by engine identity, not error message #289

Closed
shriram-devrev wants to merge 1 commit into main from fix/duckdb-isdetached-identity-check

@shriram-devrev

Contributor

What

Corrects #287. _getConnection() now tracks the engine instance (connectionDb) a cached connection is bound to, and drops the connection when instanceManager.getDB() returns a different instance (by reference) or the same instance reports isDetached(). query()'s message-matching retry (which never fired) is removed — detection now happens before the query is issued, not after a catch that never triggers.

Why #287 didn't work

Verified against the real @duckdb/duckdb-wasm source (not assumed):

```js
async postTask(e, r=[]) {
  if (!this._worker) {
    console.error("cannot send a message since the worker is not set!:" + e.type + "," + e.data);
    return;   // resolves undefined; never rejects
  }
  ...
}
```

The "worker is not set" string is only ever console.error'd, never thrown. runQuery() resolves undefined, and AsyncDuckDBConnection.query() then does RecordBatchReader.from(undefined), which throws a generic TypeError: Cannot read properties of undefined (reading 'peek') — not an error containing "worker is not set" either way. #287's _isStaleWorkerError() message check in query()'s catch block had nothing to catch. Reproduced this directly with a minimal repro against the installed package.
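The failure mode described above can be sketched in isolation. This is a minimal model, not the real duckdb-wasm or meerkat-dbm code: `postTaskDetached`, `openReader`, `isStaleWorkerError`, and `runAndClassify` are hypothetical names, and the real `postTask` is async while this model is synchronous to keep it small. It shows only that a task which logs and returns `undefined` leads to a generic `TypeError` downstream, which a message check for "worker is not set" does not match.

```typescript
// Hypothetical stand-ins for illustration; not the real duckdb-wasm or meerkat-dbm code.
// The real postTask is async; it is modelled synchronously here.

const logged: string[] = [];

// Models a detached postTask: records the message and returns undefined instead of throwing.
function postTaskDetached(type: string): unknown {
  logged.push("cannot send a message since the worker is not set!:" + type);
  return undefined;
}

// Models a consumer that dereferences the result without checking it first.
function openReader(result: unknown): unknown {
  return (result as { peek(): unknown }).peek();
}

// Models the kind of message check that the earlier fix relied on.
function isStaleWorkerError(err: unknown): boolean {
  return err instanceof Error && err.message.indexOf("worker is not set") !== -1;
}

// Runs the query path and reports which branch a catch block would take.
function runAndClassify(): "ok" | "stale-worker-retry" | "unrelated-error" {
  try {
    openReader(postTaskDetached("RUN_QUERY"));
    return "ok";
  } catch (err) {
    // The error here is a TypeError from dereferencing undefined,
    // so the message check does not match and no retry happens.
    return isStaleWorkerError(err) ? "stale-worker-retry" : "unrelated-error";
  }
}
```

In this model `runAndClassify()` takes the `"unrelated-error"` branch: the "worker is not set" text only ever reaches the log, never the thrown error.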

Why isDetached() alone (my first attempt at fixing this) also doesn't work

instanceManager.terminateDB() clears the underlying engine state, so the next getDB() call constructs a brand-new AsyncDuckDB object (confirmed against devrev-web's real DuckDbInstanceManager: this.duckDBInstance = new AsyncDuckDB(logger, this.mainWorker) on every cold boot). That new object always reports isDetached() === false — checking isDetached() on the object getDB() just returned tells you nothing about whether the cached connection (bound to the previous, now-dead object) is stale.

The actual staleness signal is identity: does the engine getDB() returns now differ from the engine the cached connection was created against? isDetached() remains a secondary check for the case where the same engine instance transitions to detached without a new object being minted.
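A minimal sketch of the identity check, under stated assumptions: only `getDB()`, `isDetached()`, `connectionDb`, and `_getConnection()` come from this PR's description. The `Engine` and `Connection` interfaces, `ConnectionCache`, and `FakeEngine` are hypothetical, and the real `getDB()` is presumably async, whereas this model is synchronous. It is not the actual meerkat-dbm implementation.

```typescript
// Hypothetical types for illustration; not the real meerkat-dbm or duckdb-wasm API.
interface Engine {
  isDetached(): boolean;
  connect(): Connection;
}

interface Connection {
  readonly engine: Engine;
}

class ConnectionCache {
  private connection: Connection | null = null;
  // The engine instance the cached connection was created against.
  private connectionDb: Engine | null = null;
  private readonly getDB: () => Engine;

  constructor(getDB: () => Engine) {
    this.getDB = getDB;
  }

  getConnection(): Connection {
    const db = this.getDB();
    // Stale if the engine was replaced (reference inequality) or the same engine detached.
    const stale =
      this.connection !== null && (this.connectionDb !== db || db.isDetached());
    if (stale) {
      this.connection = null;
      this.connectionDb = null;
    }
    if (this.connection === null) {
      this.connection = db.connect();
      this.connectionDb = db;
    }
    return this.connection;
  }
}

// Test double whose detached flag can be flipped by the caller.
class FakeEngine implements Engine {
  detached = false;
  isDetached(): boolean {
    return this.detached;
  }
  connect(): Connection {
    return { engine: this };
  }
}
```

Comparing by reference is what catches the replaced-engine case: a freshly constructed engine is never detached, so an `isDetached()` check on it alone would keep the old connection.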

Tests

Rewrote dbm.spec.ts's stale worker self-heal block (previous version tested the non-functional message-matching path):

  • reconnects when getDB() returns a different engine instance than the cached connection is bound to
  • reconnects when the same engine instance reports detached
  • reuses the cached connection when the engine is unchanged and not detached

All 21 dbm.spec tests pass; full meerkat-dbm suite (113) green; nx build + nx lint meerkat-dbm clean (0 errors).

Notes

work-item: ISS-334477


#287's self-heal never fired: duckdb-wasm's AsyncDuckDB.postTask on a
detached instance only console.errors "worker is not set" and resolves
undefined — it never throws, so the message-matching catch in query()
had nothing to catch. Traced and reproduced: connection.query() on an
undefined runQuery() result throws a generic
"Cannot read properties of undefined (reading 'peek')" TypeError instead,
which doesn't match the pattern either way.

Also, isDetached() on the engine returned by a fresh getDB() is
insufficient by itself: terminateDB() clears state so the next getDB()
constructs a brand-new AsyncDuckDB object, and that new object always
reports isDetached() === false. The signal that actually matters is
whether the engine bound to the cached connection differs (by reference)
from what getDB() returns now.

_getConnection() now tracks connectionDb (the engine instance a cached
connection is bound to) and drops the connection when getDB() returns a
different instance, or when the same instance reports isDetached() (the
same-engine case, where the engine detaches without a new object being
created).

Removed the dead message-matching retry in query() since detection now
happens before the query is issued.

Bumps meerkat-dbm 0.1.45 -> 0.1.46.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@shriram-devrev

Contributor Author

Superseded by the mid-query-race fix (guards shutdown timer on _isBusy() + adds dispose()). The isDetached/identity approach here reconnects after a completed teardown but does not prevent the shutdown timer from terminating the worker mid-query, which is the actual production race. See the new PR.
