fix(paimon): improve filter pushdown and preserve remaining filters#679

Open
Weixin-Xu wants to merge 4 commits into bytedance:main from Weixin-Xu:paimon-translator-constant-rhs-pushdown

Weixin-Xu commented Jun 26, 2026

Collaborator

Preserve untranslated Paimon filters as remaining filters, fold constant RHS expressions before predicate translation, and support string LIKE predicates.

Add translator coverage for remaining filters, constant RHS pushdown, subfield filter round trips, and string LIKE cases.

What problem does this PR solve?

Issue Number: close #685

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

  • Split top-level AND filters into conjuncts and translate each conjunct independently.
  • Push translated conjuncts into Paimon as native predicates.
  • Preserve untranslated conjuncts as remaining filters and evaluate them after reading batches.
  • Build the Paimon read schema with additional columns required by remaining filters.
  • Fold constant RHS expressions with ExpressionEvaluator before translating literals.
  • Add string LIKE predicate support for two-argument LIKE; keep three-argument LIKE with explicit escape as remaining filter.
  • Add translator and connector tests for residual filters, partial pushdown, constant RHS expressions, LIKE predicates, and round-trip subfield filter behavior.
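The split between pushed and remaining conjuncts described above can be sketched roughly as follows. This is a minimal illustration, not the connector's code: `Expr`, `leaf`, `conj`, `translate`, and `splitFilter` are hypothetical names, and the stand-in translator simply treats `gt` and `eq` leaves as supported and everything else as unsupported.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified expression node (not the connector's real type).
struct Expr {
  std::string op;                           // "and", "gt", "eq", "udf", ...
  std::string text;                         // printable form of a leaf
  std::vector<std::shared_ptr<Expr>> args;  // children of an "and" node
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string op, std::string text) {
  auto e = std::make_shared<Expr>();
  e->op = std::move(op);
  e->text = std::move(text);
  return e;
}

ExprPtr conj(std::vector<ExprPtr> args) {
  auto e = std::make_shared<Expr>();
  e->op = "and";
  e->args = std::move(args);
  return e;
}

// Recursively flattens nested top-level ANDs into a flat list of conjuncts.
void flattenConjuncts(const ExprPtr& expr, std::vector<ExprPtr>& out) {
  assert(expr != nullptr);
  if (expr->op == "and") {
    for (const auto& arg : expr->args) {
      flattenConjuncts(arg, out);
    }
  } else {
    out.push_back(expr);
  }
}

// Stand-in translator (assumption): only "gt" and "eq" leaves are supported.
std::optional<std::string> translate(const ExprPtr& e) {
  if (e->op == "gt" || e->op == "eq") {
    return "native(" + e->text + ")";
  }
  return std::nullopt;
}

struct SplitResult {
  std::vector<std::string> pushed;  // conjuncts translated to native predicates
  std::vector<ExprPtr> remaining;   // conjuncts to evaluate after reading
};

// Translates each conjunct independently; failures become remaining filters.
SplitResult splitFilter(const ExprPtr& filter) {
  std::vector<ExprPtr> conjuncts;
  flattenConjuncts(filter, conjuncts);
  SplitResult result;
  for (const auto& c : conjuncts) {
    if (auto native = translate(c)) {
      result.pushed.push_back(*native);
    } else {
      result.remaining.push_back(c);
    }
  }
  return result;
}
```

Because the conjuncts are joined by AND, evaluating the remaining ones on the batches that the pushed predicates let through yields the same rows as evaluating the whole filter, which is why an untranslatable conjunct does not have to block pushdown of its siblings.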

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

  • Negative Impact: Explained below (e.g., trade-off for correctness).

Release Note

Release Note:
- Improved Paimon filter pushdown by preserving untranslated conjuncts as remaining filters.
- Added constant RHS expression folding and string LIKE predicate support for Paimon filter translation.
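As a rough illustration of constant RHS folding, the sketch below reduces a column-free right-hand side such as `1 + 2 * 3` to a single literal before it would be handed to a predicate translator. The PR uses ExpressionEvaluator for this; the tiny recursive evaluator and the `Node`, `constant`, `column`, `binary`, and `foldConstant` names here are hypothetical stand-ins, and overflow handling is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical toy expression tree over 64-bit integers.
struct Node {
  enum class Kind { kConstant, kColumn, kAdd, kMul };
  Kind kind = Kind::kConstant;
  std::int64_t value = 0;       // used when kind == kConstant
  std::string column;           // used when kind == kColumn
  std::shared_ptr<Node> left;   // used by kAdd / kMul
  std::shared_ptr<Node> right;  // used by kAdd / kMul
};
using NodePtr = std::shared_ptr<Node>;

NodePtr constant(std::int64_t v) {
  auto n = std::make_shared<Node>();
  n->kind = Node::Kind::kConstant;
  n->value = v;
  return n;
}

NodePtr column(std::string name) {
  auto n = std::make_shared<Node>();
  n->kind = Node::Kind::kColumn;
  n->column = std::move(name);
  return n;
}

NodePtr binary(Node::Kind kind, NodePtr l, NodePtr r) {
  auto n = std::make_shared<Node>();
  n->kind = kind;
  n->left = std::move(l);
  n->right = std::move(r);
  return n;
}

// Returns the folded literal if the subtree references no column,
// or std::nullopt if it depends on row data and cannot be folded.
std::optional<std::int64_t> foldConstant(const NodePtr& n) {
  assert(n != nullptr);
  switch (n->kind) {
    case Node::Kind::kConstant:
      return n->value;
    case Node::Kind::kColumn:
      return std::nullopt;
    case Node::Kind::kAdd:
    case Node::Kind::kMul: {
      auto l = foldConstant(n->left);
      auto r = foldConstant(n->right);
      if (!l || !r) {
        return std::nullopt;
      }
      return n->kind == Node::Kind::kAdd ? *l + *r : *l * *r;
    }
  }
  return std::nullopt;
}
```

In this sketch, a filter like `a > 1 + 2 * 3` could be translated as `a > 7`, while a right-hand side that references a column cannot be folded and would stay a remaining filter.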

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)


Weixin-Xu force-pushed the paimon-translator-constant-rhs-pushdown branch 2 times, most recently from 1685567 to c984a0e June 29, 2026 08:21
Weixin-Xu requested a review from guhaiyan0221 June 29, 2026 08:21
Weixin-Xu force-pushed the paimon-translator-constant-rhs-pushdown branch from c984a0e to 6114b8f June 29, 2026 08:23
Weixin-Xu changed the title from "fix: improve Paimon filter pushdown" to "fix(paimon): improve filter pushdown and preserve remaining filters" Jun 29, 2026
Preserve untranslated Paimon filters as remaining filters, fold constant RHS expressions before predicate translation, and support string LIKE predicates.

Add translator coverage for remaining filters, constant RHS pushdown, subfield filter round trips, and string LIKE cases.
Weixin-Xu force-pushed the paimon-translator-constant-rhs-pushdown branch from 6114b8f to 5dae07b July 1, 2026 08:50
Weixin-Xu added 2 commits July 1, 2026 20:45
Reuse the common ExprToSubfieldFilter parser when an evaluator is available and keep the Paimon direct filter path as fallback.

Propagate the evaluator through the Paimon scan path and move top-level conjunction flattening into the shared expression filter utilities.

Collect top-level field accesses backed by InputTypedExpr so remaining filters can read non-output columns during Paimon scans.

Flatten top-level conjuncts before converting Paimon predicates to subfield filters so evaluator-backed extraction can push down each supported leaf.
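The idea of widening the read schema so that remaining filters can see non-output columns can be sketched as below. This is a simplified illustration under stated assumptions: `FilterExpr`, `collectFieldNames`, and `buildReadColumns` are hypothetical names, columns are plain strings rather than typed schema fields, and every field access in the tree is collected rather than only top-level accesses backed by InputTypedExpr.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical filter node: a non-empty fieldName marks a column access,
// otherwise the node is a call whose arguments are in `inputs`.
struct FilterExpr {
  std::string fieldName;
  std::vector<std::shared_ptr<FilterExpr>> inputs;
};

// Appends each column referenced by `expr` that has not been seen yet,
// preserving first-seen order.
void collectFieldNames(
    const FilterExpr& expr,
    std::vector<std::string>& names,
    std::unordered_set<std::string>& seen) {
  if (!expr.fieldName.empty() && seen.insert(expr.fieldName).second) {
    names.push_back(expr.fieldName);
  }
  for (const auto& input : expr.inputs) {
    assert(input != nullptr);
    collectFieldNames(*input, names, seen);
  }
}

// Read columns = output columns, followed by any extra columns that the
// remaining filters need but the query does not output.
std::vector<std::string> buildReadColumns(
    const std::vector<std::string>& outputColumns,
    const std::vector<std::shared_ptr<FilterExpr>>& remainingFilters) {
  std::vector<std::string> readColumns = outputColumns;
  std::unordered_set<std::string> seen(
      outputColumns.begin(), outputColumns.end());
  for (const auto& filter : remainingFilters) {
    assert(filter != nullptr);
    collectFieldNames(*filter, readColumns, seen);
  }
  return readColumns;
}
```

Keeping the output columns first means the extra filter-only columns sit at the tail of the read schema, so they can be dropped after the remaining filters have been evaluated without reordering the projected output.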
Weixin-Xu force-pushed the paimon-translator-constant-rhs-pushdown branch 2 times, most recently from 03ba1f2 to bf2135d July 2, 2026 12:42
Weixin-Xu force-pushed the paimon-translator-constant-rhs-pushdown branch from bf2135d to 28d61e0 July 3, 2026 03:12

Development

Successfully merging this pull request may close these issues.

[Bug] Paimon connector does not preserve residual filters and misses pushdown

2 participants