fix(meerkat-dbm): detect stale worker by engine identity, not error message#289
Closed
shriram-devrev wants to merge 1 commit into
Closed
fix(meerkat-dbm): detect stale worker by engine identity, not error message#289shriram-devrev wants to merge 1 commit into
shriram-devrev wants to merge 1 commit into
Conversation
…essage #287's self-heal never fired: duckdb-wasm's AsyncDuckDB.postTask on a detached instance only console.errors "worker is not set" and resolves undefined — it never throws, so the message-matching catch in query() had nothing to catch. Traced and reproduced: connection.query() on an undefined runQuery() result throws a generic "Cannot read properties of undefined (reading 'peek')" TypeError instead, which doesn't match the pattern either way. Also, isDetached() on the engine returned by a fresh getDB() is insufficient by itself: terminateDB() clears state so the next getDB() constructs a brand-new AsyncDuckDB object, and that new object always reports isDetached() === false. The signal that actually matters is whether the engine bound to the cached connection differs (by reference) from what getDB() returns now. _getConnection() now tracks connectionDb (the engine instance a cached connection is bound to) and drops the connection when getDB() returns a different instance, or when the same instance reports isDetached() (the same-engine case, e.g. build up before terminate observed). Removed the dead message-matching retry in query() since detection now happens before the query is issued. Bumps meerkat-dbm 0.1.45 -> 0.1.46. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Contributor
Author
|
Superseded by the mid-query-race fix (guards shutdown timer on _isBusy() + adds dispose()). The isDetached/identity approach here reconnects after a completed teardown but does not prevent the shutdown timer from terminating the worker mid-query, which is the actual production race. See the new PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Corrects #287.
_getConnection()now tracks the engine instance (connectionDb) a cached connection is bound to, and drops the connection wheninstanceManager.getDB()returns a different instance (by reference) or the same instance reportsisDetached().query()'s message-matching retry (which never fired) is removed — detection now happens before the query is issued, not after a catch that never triggers.Why #287 didn't work
Verified against the real
@duckdb/duckdb-wasmsource (not assumed):The "worker is not set" string is only ever
console.error'd, never thrown.runQuery()resolvesundefined, andAsyncDuckDBConnection.query()then doesRecordBatchReader.from(undefined), which throws a genericTypeError: Cannot read properties of undefined (reading 'peek')— not an error containing "worker is not set" either way. #287's_isStaleWorkerError()message check inquery()'s catch block had nothing to catch. Reproduced this directly with a minimal repro against the installed package.Why isDetached() alone (my first attempt at fixing this) also doesn't work
instanceManager.terminateDB()clears the underlying engine state, so the nextgetDB()call constructs a brand-newAsyncDuckDBobject (confirmed against devrev-web's realDuckDbInstanceManager:this.duckDBInstance = new AsyncDuckDB(logger, this.mainWorker)on every cold boot). That new object always reportsisDetached() === false— checkingisDetached()on the objectgetDB()just returned tells you nothing about whether the cached connection (bound to the previous, now-dead object) is stale.The actual staleness signal is identity: does the engine
getDB()returns now differ from the engine the cached connection was created against?isDetached()remains a secondary check for the case where the same engine instance transitions to detached without a new object being minted.Tests
Rewrote
dbm.spec.ts'sstale worker self-healblock (previous version tested the non-functional message-matching path):getDB()returns a different engine instance than the cached connection is bound toAll 21
dbm.spectests pass; fullmeerkat-dbmsuite (113) green;nx build+nx lint meerkat-dbmclean (0 errors).Notes
@devrev/meerkat-dbm0.1.45 → 0.1.46.work-item: ISS-334477