feat: engine hardening — agent-legible errors, budget timeout, limit truncation metadata#16
QueryBudget checked the clock only in charge(), i.e. on tuple emission, so a query stuck in a blocking build that emits nothing escaped the timeout until the next tuple. New checkDeadline() (time-only, no tuple count, starts the clock if first) is called per iteration in the five dark loops: sort buffering, aggregate accumulation, hash-join build, distinct dedup, and the hash-join probe bucket scan.
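A minimal sketch of the time-only check described above, assuming a budget with a tuple cap and a wall-clock timeout. Only charge() and checkDeadline() are named in this PR; the class names, fields, injectable clock, and the plain IllegalStateException (standing in for the engine's BUDGET_EXCEEDED error) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for QueryBudget; not the engine's actual class. */
final class BudgetSketch {
    private final long maxTuples;
    private final long timeoutNanos;
    private final LongSupplier clock;  // injectable so a test can control time
    private boolean started = false;
    private long startNanos;
    private long tuples = 0;

    BudgetSketch(long maxTuples, long timeoutNanos, LongSupplier clock) {
        this.maxTuples = maxTuples;
        this.timeoutNanos = timeoutNanos;
        this.clock = clock;
    }

    /** Per emitted tuple: counts the tuple, then applies the deadline check. */
    void charge() {
        if (++tuples > maxTuples) {
            throw new IllegalStateException("tuple budget exceeded");
        }
        checkDeadline();
    }

    /** Time-only check: starts the clock on the first call, never counts a tuple. */
    void checkDeadline() {
        long now = clock.getAsLong();
        if (!started) {
            started = true;
            startNanos = now;
            return;
        }
        if (now - startNanos > timeoutNanos) {
            throw new IllegalStateException("time budget exceeded");
        }
    }

    long tuplesCharged() { return tuples; }
}

/** A blocking phase (sort buffering) that consumes input but emits nothing until done. */
final class SortBufferSketch {
    static List<Integer> drainAndSort(Iterator<Integer> child, BudgetSketch budget) {
        List<Integer> buffer = new ArrayList<>();
        while (child.hasNext()) {
            budget.checkDeadline();  // charge() would never run inside this loop
            buffer.add(child.next());
        }
        Collections.sort(buffer);
        return buffer;
    }
}
```

Because the loop calls checkDeadline() rather than charge(), buffering input does not consume any of the tuple budget, which matches the statement that tuple-budget semantics are unchanged.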
User-triggered failures now throw QueryExecutionException with an ErrorCode (PARSE_ERROR, UNSUPPORTED_SQL, UNKNOWN_TABLE, UNKNOWN_COLUMN, TYPE_MISMATCH, BUDGET_EXCEEDED, DATA_ERROR, INTERNAL) and a message an LLM agent can self-correct from: unknown columns/tables list what is actually available. Previously the worst paths bypassed the clean error channel entirely: column/table misses threw bare RuntimeException (raw stack trace) or UnsupportedOperationException, which planQuery's catch(Exception) swallowed into a generic 'query could not be planned' — JSqlParser syntax messages included. An unknown table in FROM was a latent NPE in ScanOperator. planQuery now propagates instead of swallowing, the parser message is surfaced (trimmed of its expected-token dump), and unsupported operators like OR are rejected with a message instead of an EmptyStackException. BlazeDB.run gains a catch-all so internal bugs exit 1 instead of crashing, and partial output is deleted on any runtime failure, not just QueryExecutionException.
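The error channel above could look roughly like the following sketch. The enum values and the message wording come from this PR; everything else is an assumption: the wrapper class, the method signatures, the toy resolve() helper, and the choice to make the exception unchecked.

```java
import java.util.List;

/** Hypothetical container; the real types live in the engine's own packages. */
final class AgentErrorSketch {
    enum ErrorCode {
        PARSE_ERROR, UNSUPPORTED_SQL, UNKNOWN_TABLE, UNKNOWN_COLUMN,
        TYPE_MISMATCH, BUDGET_EXCEEDED, DATA_ERROR, INTERNAL
    }

    /** Assumed unchecked here; the engine's exception may be declared differently. */
    static final class QueryExecutionException extends RuntimeException {
        private final ErrorCode code;

        QueryExecutionException(ErrorCode code, String message) {
            super(message);
            this.code = code;
        }

        ErrorCode code() { return code; }
    }

    /** Single construction site for column misses, so wording cannot drift. */
    static QueryExecutionException unknownColumn(String name, List<String> available) {
        return new QueryExecutionException(ErrorCode.UNKNOWN_COLUMN,
                "Column '" + name + "' not found. Available: " + String.join(", ", available));
    }

    /** Single construction site for table misses. */
    static QueryExecutionException unknownTable(String name, List<String> available) {
        return new QueryExecutionException(ErrorCode.UNKNOWN_TABLE,
                "Table '" + name + "' not found. Available tables: " + String.join(", ", available));
    }

    /** Toy resolver: returns the column index or throws the agent-legible error. */
    static int resolve(String column, List<String> schema) {
        int index = schema.indexOf(column);
        if (index < 0) {
            throw unknownColumn(column, schema);
        }
        return index;
    }
}
```

A caller can branch on code() without parsing the message, while an agent reading the message sees the valid names it could retry with.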
LimitOperator returned null identically for 'child exhausted' and 'cap
hit', so an agent getting exactly N rows could not tell a complete
result from a truncated one. After emitting its cap the operator now
peeks the child once: wasTruncated() is exact, with no false positive
on exactly-N results. Budget exhaustion during the peek reads as
truncated instead of killing the already-complete capped result.
BlazeDB.execute returns a new QueryResult {rows, truncated, hint} for
callers (CLI prints a Rows: line; the REST gateway consumes the record
later). The output file stays pure data — all 20 sample diffs are
byte-identical. total_estimate is deferred until real statistics exist
(phase 5).
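The peek can be sketched as below, assuming an operator that pulls from a plain Iterator and returns null when done. Only wasTruncated() is named in this PR; the class name, the next() method, and the lazy timing of the peek (on the first call after the cap is reached) are assumptions made for illustration.

```java
import java.util.Iterator;

/** Hypothetical limit operator; the engine's operator interface may differ. */
final class LimitSketch<T> {
    private final Iterator<T> child;
    private final long cap;
    private long emitted = 0;
    private boolean peeked = false;
    private boolean truncated = false;

    LimitSketch(Iterator<T> child, long cap) {
        this.child = child;
        this.cap = cap;
    }

    /** Returns the next row, or null once the cap is hit or the child is exhausted. */
    T next() {
        if (emitted < cap && child.hasNext()) {
            emitted++;
            return child.next();
        }
        if (emitted >= cap && !peeked) {
            // Peek exactly once: a further row means the result was cut short.
            peeked = true;
            truncated = child.hasNext();
        }
        return null;
    }

    /** Meaningful once next() has returned null. */
    boolean wasTruncated() { return truncated; }
}
```

With a cap of 3, a child holding exactly three rows reports false and a child holding four reports true, which is the distinction the identical null return used to hide.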
- Limit peek now treats any QueryExecutionException as truncated, not just budget exhaustion: a malformed row past the cap is not part of the answer and must not fail an already-complete capped result
- consolidate unknown-table/unknown-column message construction into DBCatalog.unknownTable and PlanContext.unknownColumn so wording cannot drift between detection paths
- fix stale PlannedQuery javadoc (root is never null now), drop dead local and commented-out fragment, document exception constructors and the attachBudget timing contract
@claude review this PR
Claude finished @JinBa1's task in 5m 37s.
…umn-miss wording
- LimitOperator.peekForMore rethrows INTERNAL: an engine bug on the peeked tuple must not be masked as truncation metadata; user-facing failures past the cap (budget, bad row) still read as truncated
- ExpressionPreprocessor takes the PlanContext and reuses ctx.unknownColumn, so WHERE-clause column misses share the single construction site (base-table name doubles as schema id)
- cover previously untested error branches: unsupported SELECT/ORDER BY arithmetic items, scan row-shape DATA_ERROR after catalog init
Declined from review: public QueryResult factories (consumers read the record, only execute() constructs), run() returning QueryResult (CLI exit-code contract; REST uses execute directly), 1_000_000L suffix (Long*int already promotes to long, so there is no semantic difference).
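The peekForMore rule from this commit (rethrow INTERNAL, read every other failure past the cap as truncation) might be sketched like this. The method name and error codes come from the PR; the Supplier-based signature, the nested types, and the unchecked exception are assumptions.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of the peek's failure handling; not the engine's code. */
final class PeekSketch {
    enum ErrorCode { BUDGET_EXCEEDED, DATA_ERROR, INTERNAL }

    static final class QueryExecutionException extends RuntimeException {
        private final ErrorCode code;

        QueryExecutionException(ErrorCode code, String message) {
            super(message);
            this.code = code;
        }

        ErrorCode code() { return code; }
    }

    /**
     * Pulls one tuple past the cap. childNext is assumed to return the next
     * tuple or null when the child is exhausted.
     */
    static boolean peekForMore(Supplier<Object> childNext) {
        try {
            return childNext.get() != null;
        } catch (QueryExecutionException e) {
            if (e.code() == ErrorCode.INTERNAL) {
                throw e;  // an engine bug must not be masked as metadata
            }
            return true;  // budget or bad row past the cap: the capped result still stands
        }
    }
}
```

Reading a user-facing failure as "truncated" is the conservative choice: the capped rows are already complete, and the agent is told there may be more rather than losing the whole result.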
Review addressed in b87f101. Fixed:
Declined, with reasoning:
386/386 tests, 20/20 sample queries byte-identical.
Hardens the engine for agent consumers ahead of the REST gateway: errors an LLM can self-correct from, a timeout that holds when no tuples flow, and a truncation flag agents can trust.
Agent-legible errors
- `ErrorCode` enum (`PARSE_ERROR`, `UNSUPPORTED_SQL`, `UNKNOWN_TABLE`, `UNKNOWN_COLUMN`, `TYPE_MISMATCH`, `BUDGET_EXCEEDED`, `DATA_ERROR`, `INTERNAL`) on `QueryExecutionException`, so the future REST layer can map codes to HTTP statuses without parsing message text.
- Unknown columns and tables list what is actually available: `Column 'student.nam' not found. Available: student.age, student.cid, ...` / `Table 'Studnt' not found. Available tables: Course, Student`. Message construction is centralized (`PlanContext.unknownColumn`, `DBCatalog.unknownTable`) so wording cannot drift between detection paths.
- `QueryPlanner.planQuery` no longer swallows failures into a generic "query could not be planned": JSqlParser syntax messages surface (trimmed of the expected-token dump), non-SELECT statements get a clear rejection, and planner exceptions propagate. Paths that threw bare `RuntimeException`/`UnsupportedOperationException` (column resolution, ORDER BY/GROUP BY/SELECT item validation, unsupported operators like `OR`) are converted; `OR` previously died with an opaque `EmptyStackException`.
- An unknown table in FROM was a latent NPE in `ScanOperator`; it now raises `UNKNOWN_TABLE` with the available list.
- `BlazeDB.run` gains a catch-all so internal bugs exit 1 instead of crashing through `main`, and partial output is deleted on any runtime failure.
Budget timeout reaches emission-starved phases
`QueryBudget` checked the clock only inside `charge()`, i.e. on tuple emission, so a query stuck in a long blocking build that emits nothing escaped the timeout. New `QueryBudget.checkDeadline()` (time-only, no tuple count, starts the clock if it is the first call) is called per iteration in the five dark loops: sort buffering, aggregate accumulation, hash-join build, distinct dedup, and the hash-join probe's bucket scan. Tuple-budget semantics and all existing budget tests are untouched.
Token-aware limit metadata
`LimitOperator` returned null identically for "child exhausted" and "cap hit". It now peeks the child once after the cap: `wasTruncated()` is exact (no false positive on exactly-N results). Failures during the peek (budget exhaustion or a bad row past the cap) read as `truncated` instead of killing the already-complete result. `BlazeDB.execute` returns `QueryResult {rows, truncated, hint}` for callers; the CLI prints a `Rows:` line; the output file stays pure data. `total_estimate` is deferred until real statistics exist.
Verification
- Tests pass (`mvn test`), including new suites: `AgentLegibleErrorsTest`, `BlockingDrainTimeoutTest`, `LimitTruncationTest`, `QueryResultTest`.
- Sample query output is byte-identical (`SampleQueryRunner`: 20/20 PASSED).
Known follow-up
Unqualified column references (`SELECT a FROM T`) still NPE inside the engine (pre-existing; caught by the new CLI catch-all but not yet agent-legible). Candidate for a small follow-up: a plan-time qualified-name check mirroring the existing aggregate-argument guard.
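As a rough illustration of what such a plan-time check might involve, the sketch below rejects a column reference that lacks a table qualifier. Nothing here is in this PR: the class, both methods, the message wording, and the use of IllegalArgumentException are hypothetical, and a real check would presumably raise QueryExecutionException with a suitable ErrorCode.

```java
/** Hypothetical plan-time guard; the follow-up is only proposed, not implemented. */
final class QualifiedNameSketch {
    /** True when the reference has the form Table.column with both parts non-empty. */
    static boolean isQualified(String columnRef) {
        int dot = columnRef.indexOf('.');
        return dot > 0 && dot < columnRef.length() - 1;
    }

    /** Fails before execution instead of letting the engine hit an NPE later. */
    static void requireQualified(String columnRef) {
        if (!isQualified(columnRef)) {
            throw new IllegalArgumentException(
                    "Column '" + columnRef + "' must be qualified as Table.column");
        }
    }
}
```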