Land $name params, ORDER BY/LIMIT subqueries, LSM shutdown durability (native layer deferred) #4

Merged
dmitrat merged 26 commits into main from integration/witdb-contrib on Jun 13, 2026

@dmitrat (Owner) commented on Jun 13, 2026

Consolidates the valuable changes from #2 and #3, fixes their two real issues, and defers the unrelated NativeAOT layer to its own PR. Full functional suite green: 6315 passed, 0 failed (Category!=Performance).

What lands

  • $name SQLite parameters (from #2, "[2/2] fix(parser): SQLite $name parameters (Microsoft.Data.Sqlite / EF ADO compat)"): WHERE col = $id now works for Microsoft.Data.Sqlite / EF ADO compatibility. The value is bound as a real parameter end-to-end, so it is injection-safe.
  • ORDER BY / LIMIT in nested subqueries (already in the base of #3, "[1/2] ci: stabilize GitHub Actions build (ANTLR jar + flusher shutdown)", but undocumented there): enables DELETE … WHERE id NOT IN (SELECT … ORDER BY … LIMIT n). This is a safe relaxation: the parser accepts a superset of what it accepted before.
  • LSM shutdown durability — Flusher/Compactor drain-before-cancel (as contributed) plus a fix to LsmParallelWriter (see below).
  • CI stabilization — vendored ANTLR tool jar, Performance benchmarks marked & excluded from the PR gate (thresholds restored).
  • IndexMetadata System.Text.Json source-gen (Core) — wire-compatible, removes a reflection dependency.
  • ADO Linux path normalization.
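
To make the prune pattern from the ORDER BY / LIMIT item concrete, here is a minimal sketch. It runs against Python's built-in sqlite3 module as a stand-in for WitDatabase, on the assumption that the statement has the same SQLite-compatible shape. The history table, its columns, and prune_keep_newest are hypothetical names, and the :keep placeholder is sqlite3's documented named style rather than the $name form this PR adds.

```python
import sqlite3


def prune_keep_newest(conn, keep):
    """Delete every row except the `keep` newest ones (by created_at)."""
    cur = conn.execute(
        "DELETE FROM history "
        "WHERE id NOT IN ("
        "  SELECT id FROM history ORDER BY created_at DESC LIMIT :keep"
        ")",
        {"keep": keep},  # bound parameter, never concatenated into the SQL text
    )
    return cur.rowcount


# Hypothetical data: five rows, ids 1..5, newer rows have larger created_at.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany(
    "INSERT INTO history (id, created_at) VALUES (?, ?)",
    [(i, 100 + i) for i in range(1, 6)],
)

deleted = prune_keep_newest(conn, 2)
remaining = [row[0] for row in conn.execute("SELECT id FROM history ORDER BY id")]
```

The subquery needs both ORDER BY and LIMIT to pick which rows survive, which is why a parser that rejects them inside IN / NOT IN cannot express this kind of prune in a single statement.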

Fixes applied on top

  • LsmParallelWriter shutdown (12a988f) — the contributed change cancelled the token before joining the merge task (unlike Flusher/Compactor), so the final drain could run merges against a store that is already tearing down and fault awaited writes. Reworked the merge loop to exit cleanly on channel completion and reordered Dispose/DisposeAsync to Complete → join → cancel. +4 shutdown-durability tests.
  • $name prefix asymmetry (4c35690) — a $id/:id placeholder failed to resolve a value registered under a bare/@-prefixed name. Made the fallback symmetric (exact key still wins first → no wrong-bind). +5 tests.

Deferred

  • OutWit.Database.Native (NativeAOT C-ABI for pywitdb) removed here (077bfd6). It is isolated from the published packages but unrelated to this work and adds an unreviewed native subsystem; it should return as a dedicated PR with its own review and release pipeline (it ships as a native shared lib, not a NuGet package). The IndexMetadata source-gen change it introduced is kept.

Versions (all bumped to 1.1.0)

Core, Parser, Database, AdoNet, EntityFramework.

Supersedes #2 and #3 (close them in favor of this).

🤖 Generated with Claude Code

KarataevDmitry and others added 26 commits June 6, 2026 12:18
Expose witdb_open/get/put/delete/txn via NativeAOT for pywitdb ctypes; include smoke harness and witdb.h.
SQLite-compat: IN/NOT IN, EXISTS, and FROM subqueries use full queryExpression so CASA-style prune DELETE works.
Extend WitDb native C ABI for SQL helpers; refresh smoke harness and parser tests.
Multi-target net9/net10 builds race on Maven Central jar downloads in GitHub Actions (Antlr4BuildTasks #99). Point AntlrProbePath at a checked-in antlr4-4.13.1-complete.jar and assert it in CI.
AntlrProbePath file URI with parent segments did not resolve on GitHub Actions; use explicit GetFullPath to build/antlr/antlr4-4.13.1-complete.jar.
DisposeAsync cancelled workers before pending channel items were flushed, causing DisposeWaitsForPendingFlushesTest to fail on CI.
Antlr4BuildTasks reads jar path from Antlr4 item metadata; PropertyGroup alone left net10.0 with an empty jar on Linux.
Match flusher fix: wait for workers after channel complete, then cancel CTS.
Path.GetFileNameWithoutExtension failed for C:\\ paths on ubuntu-latest CI runners.
Cancellation exited the merge loop without flushing queued buffers; finally block drains pending writes.
ItemDefinitionGroup after Antlr4 items left metadata empty on clean CI builds.
PARAM_DOLLAR_NAMED is lexed before $n numbered params; evaluator and serializer round-trip $paramName for Microsoft.Data.Sqlite compat.
ADO $id placeholders no longer become @$id in execution context.
Includes Forge-style MigrationHistory WHERE MigrationId = $id scenario.
…ance

Revert CI-driven threshold relaxations to 10/12/50. Tag Level3_ConstraintValidationTests with Category Performance for selective runs.
Shared ubuntu runners are too noisy for scaling ratio benchmarks. CI runs functional tests only; perf benchmarks stay for local or dedicated runs.
The contributor PRs bundled a new OutWit.Database.Native (NativeAOT C ABI for
the pywitdb Python binding) into the shared base. It is isolated from the
published packages (IsPackable=false, not in pack/publish allow-lists, no
shipped project references it) but is unrelated to the CI/parameter work the
PRs describe, adds an unreviewed native subsystem to the solution build, and
has its own concurrency contract to settle. Removing it here so the valuable
DB-library changes can land cleanly; the native layer can return as a dedicated
PR with its own review and release pipeline (it ships as a native shared lib,
not a NuGet package).

Kept the IndexMetadata System.Text.Json source-gen change (Core) — it is
wire-compatible, removes a reflection dependency, and stands on its own.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ity)

The contributor's shutdown fix made the writer drain in a finally block but still
cancelled the token BEFORE joining the merge task (Complete -> Cancel -> Wait),
unlike LsmMemTableFlusher / LsmParallelCompactor which do Complete -> Wait -> Cancel.
The writer is the only one of the three that writes user data through the store
during the drain, so cancelling first could run the final merges against a store
that is already tearing down and fault awaited writes instead of writing them
durably - undermining the durability the change was meant to provide.

Rework the merge loop to exit cleanly when the channel is completed (WaitToReadAsync
returns false) rather than relying on cancellation, and reorder both Dispose and
DisposeAsync to Complete -> join -> cancel, with a cancel escape hatch only if the
drain join times out. This matches the proven flusher/compactor pattern: the final
merges run against a live store and every queued/awaited buffer is durably written.

Tests: 4 new shutdown-durability tests (sync/async drain of a large queued backlog,
awaited writes complete successfully across shutdown, Dispose does not hang). Full
LsmParallelWriter fixture 17/17.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
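
The Complete -> join -> cancel ordering described in this commit can be modeled in a few lines. This is a language-neutral sketch in Python asyncio, not the LsmParallelWriter code: DrainingWriter, enqueue, close, and the sentinel that stands in for channel completion are all hypothetical, and a plain list plays the role of the store.

```python
import asyncio


class DrainingWriter:
    """Toy background writer that drains its queue before shutting down."""

    _DONE = object()  # sentinel standing in for "channel completed"

    def __init__(self, store):
        self._store = store
        self._queue = asyncio.Queue()
        self._task = asyncio.create_task(self._merge_loop())

    def enqueue(self, item):
        """Queue an item; the returned future completes once it is written."""
        done = asyncio.get_running_loop().create_future()
        self._queue.put_nowait((item, done))
        return done

    async def _merge_loop(self):
        # Exit on completion of the input, not on cancellation, so every
        # queued item is written while the store is still alive.
        while True:
            entry = await self._queue.get()
            if entry is self._DONE:
                return
            item, done = entry
            self._store.append(item)
            if not done.done():
                done.set_result(True)

    async def close(self, drain_timeout=5.0):
        await self._queue.put(self._DONE)  # 1. Complete: no more input
        try:
            # 2. Join: wait for the loop to drain everything queued so far.
            await asyncio.wait_for(asyncio.shield(self._task), drain_timeout)
        except asyncio.TimeoutError:
            # 3. Cancel only as an escape hatch if the drain hangs.
            self._task.cancel()
            await asyncio.gather(self._task, return_exceptions=True)


async def demo():
    store = []
    writer = DrainingWriter(store)
    pending = [writer.enqueue(i) for i in range(100)]
    await writer.close()
    results = await asyncio.gather(*pending)
    return store, results


store, results = asyncio.run(demo())
```

Cancelling before the join would let the loop be interrupted with items still queued, which is the failure mode the commit message describes for awaited writes.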
…tyles

The $name work left an asymmetry: a bare caller key 'id' is normalized to '@id'
by the engine, and only an @-prefixed placeholder resolved it. A $id or :id
placeholder built the key '$id'/':id', missed the stored '@id', and the bare-name
fallback looked up 'id' (also absent) -> KeyNotFoundException. So EF/Sqlite-compat
callers mixing a $id placeholder with a bare/@-registered value hit a confusing
failure.

Extend EvaluateParameter's fallback to also try the '@'-normalized bare name, so
$name / :name / @name placeholders all resolve a value registered under the bare
name or '@name'. The exact placeholder key is still matched FIRST, so an explicit
$id binding always wins over the fallback - this cannot bind the wrong parameter
(verified by ExactDollarKeyWinsOverNormalizedFallbackTest).

Tests: 5 new engine tests (bare key, @-prefixed key, colon placeholder, exact-wins,
mixed @/$/: prefixes in one statement). Fixture 7/7.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
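
The lookup order described in this commit (exact key first, then the bare name, then the '@'-normalized bare name) can be sketched as a small pure function. This is an illustrative model in Python, not EvaluateParameter itself: normalize_key and resolve are hypothetical names, and a plain dict stands in for the engine's parameter store.

```python
PREFIXES = ("@", "$", ":")


def normalize_key(name):
    """Model the engine storing a bare caller key 'id' as '@id'."""
    return name if name.startswith(PREFIXES) else "@" + name


def resolve(placeholder, bound):
    """Resolve a placeholder such as '$id' against the bound parameters."""
    # 1. The exact placeholder key always wins, so an explicit '$id'
    #    binding can never be shadowed by the fallback.
    if placeholder in bound:
        return bound[placeholder]
    bare = placeholder[1:] if placeholder.startswith(PREFIXES) else placeholder
    # 2. Fall back to the bare name, then to its '@'-normalized form.
    for candidate in (bare, "@" + bare):
        if candidate in bound:
            return bound[candidate]
    raise KeyError(placeholder)


# The caller registered a bare 'id', which the model stores as '@id'.
bound = {normalize_key("id"): 7}
resolved = [resolve(p, bound) for p in ("@id", "$id", ":id")]

# When both keys exist, the exact '$id' binding is chosen over '@id'.
exact = resolve("$id", {"$id": 1, "@id": 2})
```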
Minor bump for the consolidated contributor release: $name SQLite parameters and
ORDER BY/LIMIT-in-subqueries (new parser capabilities), LSM shutdown durability
fixes, IndexMetadata source-gen, ADO Linux path. EntityFramework has no direct
code change but is bumped to propagate the new dependency versions downstream.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ance")]

PR #3 excluded scaling/threshold benchmarks from the CI gate but only categorized
Level3_ConstraintValidationTests. Six pure-benchmark fixtures were left
uncategorized and still ran in the gate, so they flake on shared CI runners -
InsertPerformanceTests.SingleInsertMemoryTest failed on net9.0 with
'allocated 125360, expected < 100000' (a ~25% over-threshold from runner noise,
not a regression; it passes locally and the change set does not touch the insert
allocation path).

Mark the six remaining pure-benchmark fixtures at class level so the
'Category!=Performance' gate excludes them, matching how Level3 was handled:
InsertPerformanceTests, DmlPerformanceTests, SelectPerformanceTests,
MemoryLeakTests, ProfilingTests, Level3_SqlEngineTests. They still run locally and
in any dedicated perf workflow. Stress fixtures (correctness-under-load, no timing
thresholds) are intentionally left in the gate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…nistic CI build)

PR #3 vendored antlr4-4.13.1-complete.jar and set %(Antlr4.AntlrToolJar) via an
ItemDefinitionGroup. But Antlr4BuildTasks.targets defines its OWN
ItemDefinitionGroup default of <AntlrToolJar></AntlrToolJar> (empty) and is
imported AFTER the project body, so IDG-vs-IDG precedence is SDK-evaluation-order
dependent: it resolved on the local SDK but came up empty on the CI SDK, which
then silently fell back to Antlr4BuildTasks' network jar download (JavaExec
PATH;DOWNLOAD). That made the build network-flaky - the first CI run downloaded
the jar and passed, the next failed with ANT02 'could not find an Antlr4 tool jar
... currently set to '''.

Move AntlrToolJar to inline metadata on the <Antlr4> Include item. Inline item
metadata always wins over any ItemDefinitionGroup regardless of import/SDK order
(the package default is an IDG, not an item Update - verified), so the committed
jar is used deterministically and no download is attempted. Verified via
'-getItem:Antlr4' and a clean ANTLR regen build.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@dmitrat merged commit 1093db9 into main on Jun 13, 2026
1 check passed
@dmitrat deleted the integration/witdb-contrib branch on June 13, 2026 15:05
KarataevDmitry pushed a commit to AI-Guiders/WitDatabase that referenced this pull request Jun 13, 2026
…stic stop)

DisposeStopsBackgroundCollectionTest flaked on main after dmitrat#4 (Expected 5, was 6):
MvccGarbageCollector.Dispose() called Timer.Dispose(), which does NOT wait for an
already-running callback. A collection cycle that started just before Dispose could
finish and bump RunCount afterwards, so 'no runs after Dispose' held only when the
timing happened to cooperate (green on the PR runner, red on main).

Use the Timer.Dispose(WaitHandle) overload and wait for the handle: it signals once
all callbacks have drained, so no background collection runs after Dispose returns
and RunCount is final. Pre-existing race in a subsystem untouched by dmitrat#4 - just
surfaced by runner timing. MVCC GC fixture 14/14, stable across repeated runs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
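
The idea behind this fix, that Dispose should not return until any in-flight callback has finished, can be illustrated with a thread-based analogue. This is a hedged sketch in Python rather than the .NET Timer API: PeriodicCollector and its members are hypothetical, and joining the worker thread plays the role of waiting on the handle passed to Timer.Dispose(WaitHandle).

```python
import threading
import time


class PeriodicCollector:
    """Toy periodic background job whose dispose() waits for in-flight work."""

    def __init__(self, interval):
        self.run_count = 0
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Event.wait returns False on timeout (run one cycle) and True once
        # the stop flag is set (leave the loop).
        while not self._stop.wait(self._interval):
            time.sleep(0.01)  # simulated collection work
            self.run_count += 1

    def dispose(self):
        self._stop.set()     # request stop
        self._thread.join()  # wait for a cycle that is already running


collector = PeriodicCollector(interval=0.005)
time.sleep(0.1)
collector.dispose()
count_at_dispose = collector.run_count
time.sleep(0.05)
count_later = collector.run_count
```

Without the join, a cycle that started just before dispose() could still increment run_count afterwards, which mirrors the "Expected 5, was 6" flake described above.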