feat: engine hardening — agent-legible errors, budget timeout, limit truncation metadata by JinBa1 · Pull Request #16 · JinBa1/java-query-engine

JinBa1 · 2026-06-13T00:31:49Z

Hardens the engine for agent consumers ahead of the REST gateway: errors an LLM can self-correct from, a timeout that holds when no tuples flow, and a truncation flag agents can trust.

Agent-legible errors

New ErrorCode enum (PARSE_ERROR, UNSUPPORTED_SQL, UNKNOWN_TABLE, UNKNOWN_COLUMN, TYPE_MISMATCH, BUDGET_EXCEEDED, DATA_ERROR, INTERNAL) on QueryExecutionException — the future REST layer maps codes to HTTP statuses without parsing message text.
Unknown columns/tables now say what is available: Column 'student.nam' not found. Available: student.age, student.cid, ... / Table 'Studnt' not found. Available tables: Course, Student. Message construction is centralized (PlanContext.unknownColumn, DBCatalog.unknownTable) so wording cannot drift between detection paths.
QueryPlanner.planQuery no longer swallows failures into a generic "query could not be planned": JSqlParser syntax messages surface (trimmed of the expected-token dump), non-SELECT statements get a clear rejection, and planner exceptions propagate. Paths that threw bare RuntimeException/UnsupportedOperationException (column resolution, ORDER BY/GROUP BY/SELECT item validation, unsupported operators like OR) are converted — OR previously died with an opaque EmptyStackException.
Unknown table in FROM was a latent NPE in ScanOperator; now UNKNOWN_TABLE with the available list.
BlazeDB.run gains a catch-all so internal bugs exit 1 instead of crashing through main, and partial output is deleted on any runtime failure.

Budget timeout reaches emission-starved phases

QueryBudget checked the clock only inside charge(), i.e. on tuple emission — a query stuck in a long blocking build that emits nothing escaped the timeout. New QueryBudget.checkDeadline() (time-only, no tuple count, starts the clock if first) is called per iteration in the five dark loops: sort buffering, aggregate accumulation, hash-join build, distinct dedup, and the hash-join probe's bucket scan. Tuple-budget semantics and all existing budget tests are untouched.

Token-aware limit metadata

LimitOperator returned null identically for "child exhausted" and "cap hit". It now peeks the child once after the cap: wasTruncated() is exact (no false positive on exactly-N results). Failures during the probe — budget exhaustion or a bad row past the cap — read as truncated instead of killing the already-complete result. BlazeDB.execute returns QueryResult {rows, truncated, hint} for callers; the CLI prints a Rows: line; the output file stays pure data. total_estimate is deferred until real statistics exist.

Verification

382/382 tests pass (mvn test), including new suites: AgentLegibleErrorsTest, BlockingDrainTimeoutTest, LimitTruncationTest, QueryResultTest.
All 20 sample queries byte-identical (SampleQueryRunner: 20/20 PASSED).

Known follow-up

Unqualified column references (SELECT a FROM T) still NPE inside the engine (pre-existing; caught by the new CLI catch-all but not yet agent-legible). Candidate for a small follow-up: a plan-time qualified-name check mirroring the existing aggregate-argument guard.

QueryBudget checked the clock only in charge(), i.e. on tuple emission, so a query stuck in a blocking build that emits nothing escaped the timeout until the next tuple. New checkDeadline() (time-only, no tuple count, starts the clock if first) is called per iteration in the five dark loops: sort buffering, aggregate accumulation, hash-join build, distinct dedup, and the hash-join probe bucket scan.

User-triggered failures now throw QueryExecutionException with an ErrorCode (PARSE_ERROR, UNSUPPORTED_SQL, UNKNOWN_TABLE, UNKNOWN_COLUMN, TYPE_MISMATCH, BUDGET_EXCEEDED, DATA_ERROR, INTERNAL) and a message an LLM agent can self-correct from: unknown columns/tables list what is actually available. Previously the worst paths bypassed the clean error channel entirely: column/table misses threw bare RuntimeException (raw stack trace) or UnsupportedOperationException, which planQuery's catch(Exception) swallowed into a generic 'query could not be planned' — JSqlParser syntax messages included. An unknown table in FROM was a latent NPE in ScanOperator. planQuery now propagates instead of swallowing, the parser message is surfaced (trimmed of its expected-token dump), and unsupported operators like OR are rejected with a message instead of an EmptyStackException. BlazeDB.run gains a catch-all so internal bugs exit 1 instead of crashing, and partial output is deleted on any runtime failure, not just QueryExecutionException.

LimitOperator returned null identically for 'child exhausted' and 'cap hit', so an agent getting exactly N rows could not tell a complete result from a truncated one. After emitting its cap the operator now peeks the child once: wasTruncated() is exact, with no false positive on exactly-N results. Budget exhaustion during the peek reads as truncated instead of killing the already-complete capped result. BlazeDB.execute returns a new QueryResult {rows, truncated, hint} for callers (CLI prints a Rows: line; the REST gateway consumes the record later). The output file stays pure data — all 20 sample diffs are byte-identical. total_estimate is deferred until real statistics exist (phase 5).

- Limit peek now treats any QueryExecutionException as truncated, not just budget exhaustion: a malformed row past the cap is not part of the answer and must not fail an already-complete capped result - consolidate unknown-table/unknown-column message construction into DBCatalog.unknownTable and PlanContext.unknownColumn so wording cannot drift between detection paths - fix stale PlannedQuery javadoc (root is never null now), drop dead local and commented-out fragment, document exception constructors and the attachBudget timing contract

codecov-commenter · 2026-06-13T00:32:27Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 81.01266% with 30 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...n/java/com/github/jinba1/blazedb/QueryPlanner.java	78.04%	5 Missing and 4 partials ⚠️
...c/main/java/com/github/jinba1/blazedb/BlazeDB.java	66.66%	4 Missing ⚠️
...main/java/com/github/jinba1/blazedb/DBCatalog.java	72.72%	3 Missing ⚠️
.../github/jinba1/blazedb/ExpressionPreprocessor.java	66.66%	1 Missing and 2 partials ⚠️
...m/github/jinba1/blazedb/operator/ScanOperator.java	62.50%	2 Missing and 1 partial ⚠️
...in/java/com/github/jinba1/blazedb/PlanContext.java	80.00%	1 Missing and 1 partial ⚠️
...hub/jinba1/blazedb/operator/AggregateOperator.java	33.33%	2 Missing ⚠️
...m/github/jinba1/blazedb/operator/JoinOperator.java	0.00%	2 Missing ⚠️
...in/java/com/github/jinba1/blazedb/StringValue.java	0.00%	1 Missing ⚠️
...om/github/jinba1/blazedb/operator/Accumulator.java	66.66%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

JinBa1 · 2026-06-13T00:39:09Z

@claude review this PR

claude · 2026-06-13T00:39:21Z

Claude finished @JinBa1's task in 5m 37s —— View job

PR Review: `feat/engine-hardening`

Summary: Solid hardening pass — the three pillars (agent-legible errors, blocking-phase timeout, truncation metadata) are coherent and well-scoped. All 382 tests pass, the ErrorCode/QueryExecutionException plumbing is correct, and the centralized error-construction pattern (PlanContext.unknownColumn, DBCatalog.unknownTable) is a good design choice. A few issues worth addressing before merge:

Bugs / Correctness

1. LimitOperator.peekForMore() silences INTERNAL errors

peekForMore() catches all QueryExecutionException, including ErrorCode.INTERNAL (engine bugs). If a bug manifests exactly on the peek-past-the-cap tuple, the exception is swallowed and truncated = true is returned — hiding a real bug from the caller.

// LimitOperator.java ~L60
} catch (QueryExecutionException e) {
    // INTERNAL errors are silently swallowed here
    truncated = true;
}

The comment correctly describes the intent for DATA_ERROR/BUDGET_EXCEEDED, but catching INTERNAL goes too far. Consider rethrowing internal errors:

} catch (QueryExecutionException e) {
    if (e.code() == ErrorCode.INTERNAL) throw e;
    truncated = true;
}

Fix this →

Design Issues

2. QueryResult factory methods are package-private

QueryResult.complete() and QueryResult.truncated() have no visibility modifier (package-private). The REST gateway mentioned in the PR description will live in a different package and won't be able to call these. Should be public static.

// QueryResult.java L15
static QueryResult complete(long rows) {       // should be public static
static QueryResult truncated(long rows) {      // should be public static

Fix this →

3. run() discards execute()'s return value

BlazeDB.java line 83 calls execute(rootOp, args[2]) as a void statement, silently discarding the QueryResult. Truncation metadata is already printed inside execute(), so CLI users still see it — but any future programmatic wrapper that calls run() gets no structured result back. At minimum, consider returning QueryResult from run() rather than just an int exit code.

Minor Issues

4. Implicit long overflow in checkDeadline()

// QueryBudget.java ~L65
deadlineNanos = System.nanoTime() + timeoutMs * 1_000_000;

timeoutMs is Long; the multiplication unboxes to long * int. For values above ~9.2 × 10¹² ms (impractical, but a maintenance trap), this overflows silently. Use 1_000_000L to make the arithmetic explicit:

deadlineNanos = System.nanoTime() + timeoutMs * 1_000_000L;

5. ExpressionPreprocessor error message format diverges from PlanContext.unknownColumn()

When a column isn't found during preprocessing (line ~182):

// ExpressionPreprocessor.java
"Column '" + columnName + "' not found in " + tableName + ". Available: ..."

vs. PlanContext.unknownColumn():

"Column '" + tableName + "." + columnName + "' not found. Available: ..."

Two different formats for the same UNKNOWN_COLUMN error, breaking the "single construction site" principle the PR correctly establishes elsewhere. The preprocessor should call PlanContext.formatColumns(DBCatalog.getInstance().getDBSchemata(tableName)) but wrap the message via the same helper, or reuse ctx.unknownColumn() directly.

Test Coverage Gaps (matching Codecov report)

6. JoinOperator — 0% on new lines: The INTERNAL path ("Could not retrieve schema for join") is an engine-invariant guard and realistically untestable in isolation, but a unit test using null-returning mock schemas would close the gap if desired.

7. ScanOperator — 37.5% on new lines: The new null-check if (tablePath == null || columnTypes == null) fires when a table referenced in FROM doesn't exist. AgentLegibleErrorsTest covers this indirectly but the Codecov miss suggests the ScanOperator constructor path isn't reached in the new tests — worth verifying that the unknownTableInFrom test actually triggers this branch vs. being caught earlier in ExpressionPreprocessor.

What's Well Done

parserMessage() — clean trimming of the "Was expecting..." token dump from JSQLParser. Exactly what agent consumers need.
ExpressionEvaluator OR guard — catching unsupported operators before evaluating children (preventing the EmptyStackException) is the right fix and the comment explains why.
peekForMore() exactly-N correctness — calling the child once past the cap and checking for non-null correctly avoids false-positive truncation when the child has exactly limit rows.
checkBudgetDeadline() placement — covering all five dark loops (sort, aggregate, dedup, hash-join build, hash-join bucket scan) is thorough.
Centralized error construction — PlanContext.unknownColumn() and DBCatalog.unknownTable() single-site pattern prevents wording drift. Issue chore(deps-dev): bump org.apache.maven.plugins:maven-resources-plugin from 3.0.2 to 3.5.0 #5 above is the only gap.
BlazeDB.run error ordering — catch (QueryExecutionException) before catch (RuntimeException) is correct; user errors get a clean message, engine bugs get the stack trace.

Gather context and understand the request
Read repo structure for guidelines
Review changed files (core engine, operators, tests)
Analyze for bugs, correctness, and design issues
Post comprehensive review feedback

…umn-miss wording - LimitOperator.peekForMore rethrows INTERNAL: an engine bug on the peeked tuple must not be masked as truncation metadata; user-facing failures past the cap (budget, bad row) still read as truncated - ExpressionPreprocessor takes the PlanContext and reuses ctx.unknownColumn, so WHERE-clause column misses share the single construction site (base-table name doubles as schema id) - cover previously untested error branches: unsupported SELECT/ORDER BY arithmetic items, scan row-shape DATA_ERROR after catalog init Declined from review: public QueryResult factories (consumers read the record, only execute() constructs), run() returning QueryResult (CLI exit-code contract; REST uses execute directly), 1_000_000L suffix (Long*int already promotes to long — no semantic difference).

JinBa1 · 2026-06-13T01:09:18Z

Review addressed in b87f101.

Fixed:

chore(deps): bump actions/checkout from 4 to 6 #1 peekForMore() now rethrows INTERNAL — engine bugs are never masked as truncation metadata. User-facing failures past the cap (budget, bad row) still read as truncated. Regression test: internalErrorDuringPeekPropagates.
chore(deps-dev): bump org.apache.maven.plugins:maven-resources-plugin from 3.0.2 to 3.5.0 #5 ExpressionPreprocessor now takes the PlanContext and reuses ctx.unknownColumn (a base-table name doubles as its schema id), so WHERE-clause column misses share the single construction site: Column 'Student.nam' not found. Available: sid, cid, age, grade.
chore(deps-dev): bump org.junit.jupiter:junit-jupiter-engine from 5.10.2 to 6.0.3 #7 Added real-behavior tests for the uncovered branches: unsupported SELECT/ORDER BY arithmetic items (UNSUPPORTED_SQL), scan row-shape change after catalog init (DATA_ERROR). Verified unknownTableInFrom does exercise the ScanOperator null-check: the query has no WHERE clause, so the preprocessor never runs — the remaining Codecov misses were the DATA_ERROR catch paths, now covered.

Declined, with reasoning:

chore(deps): bump actions/setup-java from 4 to 5 #2 QueryResult.complete/truncated stay package-private by design: consumers (REST included) read the record via its public accessors; only execute() constructs results. Public factories would invite construction outside execution.
chore(deps): bump actions/cache from 4 to 5 #3 run() keeps returning an exit code — that is the CLI process contract (main → System.exit, tests assert ints). The REST gateway will call planQuery + execute directly, which is exactly why execute returns QueryResult.
chore(deps): bump codecov/codecov-action from 5 to 6 #4 timeoutMs * 1_000_000 — timeoutMs is Long, so it unboxes to long and binary numeric promotion widens the int literal before the multiply. The L suffix is semantically identical; the overflow point (~292 years of ms) is unchanged either way.
chore(deps-dev): bump org.apache.maven.plugins:maven-clean-plugin from 3.1.0 to 3.5.0 #6 The join schema-null guard is an engine-invariant check the planner cannot reach; a mock-schema test would exercise the mock, not the engine.

386/386 tests, 20/20 sample queries byte-identical.

JinBa1 added 4 commits June 13, 2026 01:11

JinBa1 merged commit 38bd98c into main Jun 13, 2026
1 check passed

JinBa1 deleted the feat/engine-hardening branch June 13, 2026 11:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: engine hardening — agent-legible errors, budget timeout, limit truncation metadata#16

feat: engine hardening — agent-legible errors, budget timeout, limit truncation metadata#16
JinBa1 merged 5 commits into
mainfrom
feat/engine-hardening

JinBa1 commented Jun 13, 2026

Uh oh!

codecov-commenter commented Jun 13, 2026 •

edited

Loading

Uh oh!

JinBa1 commented Jun 13, 2026

Uh oh!

claude Bot commented Jun 13, 2026 •

edited

Loading

Uh oh!

JinBa1 commented Jun 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JinBa1 commented Jun 13, 2026

Agent-legible errors

Budget timeout reaches emission-starved phases

Token-aware limit metadata

Verification

Known follow-up

Uh oh!

codecov-commenter commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

JinBa1 commented Jun 13, 2026

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: feat/engine-hardening

Bugs / Correctness

Design Issues

Minor Issues

Test Coverage Gaps (matching Codecov report)

What's Well Done

Uh oh!

JinBa1 commented Jun 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov-commenter commented Jun 13, 2026 •

edited

Loading

claude Bot commented Jun 13, 2026 •

edited

Loading

PR Review: `feat/engine-hardening`