perf(electric-db-collection): compile array-column membership to @> (GIN-eligible)#1583

Open
viktor89 wants to merge 3 commits into
TanStack:mainfrom
viktor89:electric-array-column-containment

Conversation

@viktor89

@viktor89 viktor89 commented Jun 10, 2026

perf(electric-db-collection): compile array-column membership to @> (GIN-eligible)

Summary

inArray(value, arrayColumn) currently compiles to value = ANY(arrayColumn). That's correct, but Postgres cannot use a GIN index for scalar = ANY(array_column) — only for the containment operator array_column @> ARRAY[scalar]. The two expressions are logically equivalent, so this PR emits the containment form when the array operand is a column, letting the planner index-seek instead of scanning.

Membership in a literal value list (inArray(column, [a, b, c])) is unchanged — it still compiles to = ANY, which is correct there.

Why it matters

For a rare value in an array column (one that matches only a few rows), the = ANY form is pathological in the common ORDER BY ... LIMIT N shape: the planner walks an ordering index and filters row by row, and because so few rows match, it can never fill the LIMIT, so it ends up scanning the entire table.

Concrete case from a ~2M-row table with a tags text[] column and a GIN index on it:

-- before: 'ipo' = ANY(tags)  → cannot use the GIN index → full index walk (~2M rows)
-- after:  tags @> ARRAY['ipo'] → Bitmap Index Scan on the GIN index → ~7 rows
SELECT ... FROM news_item
WHERE tags @> ARRAY['ipo']
ORDER BY first_created DESC LIMIT 25;

A tag matching 7 of 2,000,000 rows went from a full-table scan to an index seek. Dense values (e.g. a tag on 25% of rows) are unaffected — the planner still picks the ordering-index walk, now adaptively, because @> exposes the GIN option rather than removing it.

Change

In compileFunction's in branch, when the RHS is a column reference (args[1].type === 'ref'), emit ${rhs} @> ARRAY[${lhs}]; otherwise keep ${lhs} = ANY(${rhs}).

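if (name === `in`) {
  // RHS is a column reference: inArray(value, arrayColumn).
  // Emit containment so the planner can consider a GIN index.
  if (args[1]?.type === `ref`) {
    return `${rhs} @> ARRAY[${lhs}]`
  }
  // RHS is a literal list: inArray(column, [a, b, c]) keeps = ANY.
  return `${lhs} ${opName}(${rhs})`
}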

The two shapes are distinguishable without column type info:

  • inArray(value, arrayColumn): args[1] is a ref → containment (@>).
  • inArray(column, [literal, list]): args[1] is a val → = ANY.
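
The dispatch on the two shapes can be sketched in isolation. This is a minimal, self-contained illustration and not the package's actual compiler: the Ref/Val types, compileOperand, compileIn, the identifier quoting, and the plain $n placeholders are all assumptions made for the example.

```typescript
// Hypothetical, simplified expression IR; the real compiler's types are richer.
type Ref = { type: "ref"; path: string[] }
type Val = { type: "val"; value: unknown }
type Expr = Ref | Val

// Compile one operand. Literals are appended to `params` and replaced by a
// 1-based Postgres placeholder ($1, $2, ...); refs become a quoted column name.
function compileOperand(expr: Expr, params: unknown[]): string {
  if (expr.type === "ref") {
    return `"${expr.path[expr.path.length - 1]}"`
  }
  params.push(expr.value)
  return `$${params.length}`
}

// Sketch of the `in` branch: containment when the RHS is a column ref,
// = ANY when the RHS is a literal list.
function compileIn(lhsExpr: Expr, rhsExpr: Expr, params: unknown[]): string {
  const lhs = compileOperand(lhsExpr, params)
  const rhs = compileOperand(rhsExpr, params)
  if (rhsExpr.type === "ref") {
    return `${rhs} @> ARRAY[${lhs}]`
  }
  return `${lhs} = ANY(${rhs})`
}

// Both shapes share one params array, so placeholder indices keep increasing.
const params: unknown[] = []
const containment = compileIn(
  { type: "val", value: "ipo" },
  { type: "ref", path: ["tags"] },
  params,
)
const anyForm = compileIn(
  { type: "ref", path: ["status"] },
  { type: "val", value: ["open", "closed"] },
  params,
)
```

Here containment takes the @> path and anyForm takes the = ANY path, which mirrors the mixed compound case described under Tests.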

Tests

Added an in operator describe block in sql-compiler.test.ts covering: literal value list → = ANY, array column → @>, and a mixed compound (@> AND = ANY) to confirm both paths coexist with correct param indices.

Notes

  • Result rows are identical; this is purely an index-eligibility (and therefore planner) improvement.
  • No public API change — existing inArray callers benefit transparently.

Summary by CodeRabbit

  • Performance

    • Array membership checks now compile into a Postgres-friendly form, enabling index-backed lookups and improving query performance.
  • Tests

    • Added comprehensive tests for array membership compilation, parameter binding/sequencing, combined cases, and edge conditions.
  • Bug Fix

    • Membership operator now validates null/undefined inputs and rejects invalid array-on-array membership.
  • Documentation

    • Updated release metadata to reflect the change.

…GIN-eligible)

inArray(value, arrayColumn) compiled to `value = ANY(arrayColumn)`, which
Postgres cannot satisfy with a GIN index. The equivalent containment form
`arrayColumn @> ARRAY[value]` can, so the planner index-seeks instead of
scanning — a large win for low-cardinality values where ORDER BY ... LIMIT
otherwise walks the whole table.

Only the array-*column* case changes (RHS is a column ref); membership in a
literal value list (inArray(column, [a, b])) still compiles to = ANY.
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 47d9042b-3ad8-40fe-ab82-237e880fc30f

📥 Commits

Reviewing files that changed from the base of the PR and between 18ea875 and b4ef909.

📒 Files selected for processing (2)
  • packages/electric-db-collection/src/sql-compiler.ts
  • packages/electric-db-collection/tests/sql-compiler.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/electric-db-collection/tests/sql-compiler.test.ts
  • packages/electric-db-collection/src/sql-compiler.ts

📝 Walkthrough

Walkthrough

The SQL compiler now emits arrayColumn @> ARRAY[value] when in checks membership against an array column reference; literal-list membership still compiles to = ANY(...). Tests and a changeset note were added covering both forms and error cases.

Changes

Array column containment optimization

| Layer / File(s) | Summary |
| Compiler branch for in with array-column (packages/electric-db-collection/src/sql-compiler.ts) | compileFunction special-cases in when the RHS is a column ref to emit arrayCol @> ARRAY[...]; in is added to isComparisonOp so comparison-style null/undefined validation applies and array-left-literal usage against an array column is rejected. |
| Tests and changeset metadata (packages/electric-db-collection/tests/sql-compiler.test.ts, .changeset/array-column-containment.md) | New in operator tests for literal-list (= ANY) and array-column (@> ARRAY) compilation, param ordering across combined expressions, empty/single lists, and null/undefined/invalid-array-left errors; changeset updated to document the containment compilation behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through code with careful paws,
Swapped ANY for @> without a pause,
Tests lined up and params in a row,
Indexes wake when the queries go,
A tiny change — the searchlights glow.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: compiling array-column membership checks to use the @> operator for GIN index eligibility, directly matching the PR's primary performance optimization objective. |
| Description check | ✅ Passed | The description provides comprehensive context on the change, motivation, implementation details, and test coverage; however, the checklist section is incomplete: no checkboxes are marked for testing or changeset generation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install timed out. The project may have too many dependencies for the sandbox.



@viktor89 viktor89 marked this pull request as ready for review June 10, 2026 17:56
@viktor89 viktor89 marked this pull request as draft June 10, 2026 18:04

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/electric-db-collection/tests/sql-compiler.test.ts`:
- Around line 159-189: Add tests to cover corner cases for the new in operator
behavior in the sql-compiler.test.ts suite: add cases for empty-array (e.g.,
val([]) against ref status or roles), single-element-array (val([x]) and ARRAY
with single value) and null/undefined operands (val(null), val(undefined), and
ref that can be null/undefined) using compileSQL and func('in', [...]) to assert
the compiled where clause and params match the intended behavior; include both
permutations where the literal/array is the left operand and where the column
ref is the left operand (e.g., func('in', [ref('status'), val([])]), func('in',
[val('admin'), ref('roles')]) etc.) so the tests validate EMPTY, single-element,
and null/undefined handling for both "= ANY" and "@> ARRAY" code paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2a9cbd23-926b-4f0b-bf80-e1a1bec0aba2

📥 Commits

Reviewing files that changed from the base of the PR and between 4d1abde and 6903ad0.

📒 Files selected for processing (3)
  • .changeset/array-column-containment.md
  • packages/electric-db-collection/src/sql-compiler.ts
  • packages/electric-db-collection/tests/sql-compiler.test.ts

Comment thread packages/electric-db-collection/src/sql-compiler.ts
Comment thread packages/electric-db-collection/tests/sql-compiler.test.ts
@viktor89 viktor89 marked this pull request as ready for review June 10, 2026 18:07
Address review: add empty-array, single-element, and null/undefined cases.
`in` now goes through the same null-operand guard as the other comparison
operators, so inArray(null/undefined, ...) throws instead of emitting an
unbound parameter.
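
A guard of this kind might look roughly like the sketch below. It is an illustration under assumptions, not the code from this commit: the Arg type and the assertNoNullOperands helper are hypothetical, and the error text only reuses the substring asserted by the tests quoted in this thread.

```typescript
// Hypothetical operand type, simplified for illustration.
type Arg =
  | { type: "ref"; path: string[] }
  | { type: "val"; value: unknown }

// Reject a literal null/undefined in any argument position, so the compiler
// fails fast instead of emitting a placeholder with nothing bound to it.
function assertNoNullOperands(name: string, args: Arg[]): void {
  const nullIndex = args.findIndex(
    (arg) =>
      arg.type === "val" && (arg.value === null || arg.value === undefined),
  )
  if (nullIndex !== -1) {
    // Message wording is assumed; only this substring appears in the tests.
    throw new Error(`Cannot use null/undefined value with '${name}' operator`)
  }
}
```

Because the check scans every argument, it covers both inArray(null, column) and inArray(column, null) without a separate code path per position.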

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/electric-db-collection/src/sql-compiler.ts (1)

359-367: ⚠️ Potential issue | 🟠 Major

Tighten/guard inArray(value, arrayColumn) so value can’t be array-valued when arrayColumn is a column ref

inArray is declared as inArray(value: ExpressionLike, array: ExpressionLike) and just constructs Func('in', ...), so the public types don’t prevent passing an array val(...) as the first argument. In packages/electric-db-collection/src/sql-compiler.ts, the in branch for args[1]?.type === 'ref' emits rhs @> ARRAY[${lhs}], which wraps an array-valued lhs into a nested/single-element array and can break the intended “contains these elements” semantics. Add a type refinement (scalar-only value when array is a column ref) or a runtime check that rejects an array-valued args[0] in this scenario.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/electric-db-collection/src/sql-compiler.ts` around lines 359 - 367,
The `inArray(value, array)` public API allows an array-valued `value` but the
SQL emitter in sql-compiler.ts (the `if (name === 'in')` branch) assumes a
scalar left-hand side when `args[1]?.type === 'ref'` and emits `${rhs} @>
ARRAY[${lhs}]`, which breaks semantics for array-valued `lhs`; update the `in`
handling so it validates/refines the first arg: in the `in` branch (same place
where `args[1]?.type === 'ref'` is checked) add a runtime guard that detects
array-valued `args[0]` (e.g., `args[0].type === 'val'` with an array payload or
any internal array marker) and either throw a clear error or coerce to the
correct SQL form, or tighten the public `inArray` typing so callers cannot pass
array `value` when `array` is a column ref; reference the inArray factory and
the `name === 'in'` emission logic to implement the check and fail fast if an
array-valued `args[0]` is used with a column `args[1]`.
🧹 Nitpick comments (1)
packages/electric-db-collection/tests/sql-compiler.test.ts (1)

206-216: 💤 Low value

Consider testing null/undefined in the RHS position for completeness.

The current tests verify that func('in', [val(null), ref('roles')]) and func('in', [val(undefined), ref('roles')]) throw. The validation logic in sql-compiler.ts (lines 181-183) uses findIndex to catch null/undefined in any argument position, so func('in', [ref('status'), val(null)]) and func('in', [ref('status'), val(undefined)]) will also throw. Adding explicit tests for those cases would provide symmetric coverage and guard against future refactoring that might break the any-position behavior.

Optional additional test cases
+      it(`should throw for null in RHS position`, () => {
+        expect(() =>
+          compileSQL({ where: func(`in`, [ref(`status`), val(null)]) }),
+        ).toThrow(`Cannot use null/undefined value with 'in' operator`)
+      })
+
+      it(`should throw for undefined in RHS position`, () => {
+        expect(() =>
+          compileSQL({ where: func(`in`, [ref(`status`), val(undefined)]) }),
+        ).toThrow(`Cannot use null/undefined value with 'in' operator`)
+      })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/electric-db-collection/tests/sql-compiler.test.ts` around lines 206
- 216, Tests only check null/undefined in the LHS of func('in') but the
compiler's validation (compileSQL) rejects null/undefined in any argument
position; add symmetric tests asserting compileSQL throws for func('in',
[ref('status'), val(null)]) and func('in', [ref('status'), val(undefined)]) so
that compileSQL's behavior is covered for RHS operands as well.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a82cf245-4531-485f-b7bd-89020fc496f7

📥 Commits

Reviewing files that changed from the base of the PR and between 6903ad0 and 18ea875.

📒 Files selected for processing (2)
  • packages/electric-db-collection/src/sql-compiler.ts
  • packages/electric-db-collection/tests/sql-compiler.test.ts

… against an array column

The `arrayColumn @> ARRAY[value]` form wraps a single scalar; passing an
array value would nest into `ARRAY[<array>]` and silently change the
membership semantics. Throw a clear error instead, and cover it with a test.
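
One possible shape for that runtime check is sketched here. The Operand type, the assertScalarMember name, and the error wording are assumptions for illustration; the commit's actual implementation and message are not shown in this thread.

```typescript
// Hypothetical operand type, simplified for illustration.
type Operand =
  | { type: "ref"; path: string[] }
  | { type: "val"; value: unknown }

// When the array side is a column ref, the compiler wraps the value as
// ARRAY[value]. An array-valued literal would become ARRAY[<array>], so
// reject it up front rather than silently changing the semantics.
function assertScalarMember(value: Operand, array: Operand): void {
  const isArrayLiteral = value.type === "val" && Array.isArray(value.value)
  if (array.type === "ref" && isArrayLiteral) {
    // Assumed wording, not the message used by the package.
    throw new Error(
      "inArray(value, arrayColumn) expects a scalar value, got an array",
    )
  }
}
```

The check deliberately leaves inArray(column, [a, b]) alone: there the array is the second argument, a literal list, and the = ANY path applies.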