feat(graphile-search): replace sigmoid weighted-average with Reciprocal Rank Fusion (RRF) #1287
Open
pyramation wants to merge 7 commits into
…al Rank Fusion (RRF)

- Replace sigmoid normalization + weighted-average scoring with rank-based RRF
- Inject ROW_NUMBER() window functions per adapter for rank computation
- Normalize RRF scores to [0,1] using only active adapters in denominator
- Add rrfK preset option (default 60, configurable smoothing constant)
- Preserve @searchConfig weights (maps to weighted RRF contributions)
- Preserve recency boost (applied as post-RRF multiplier)
- Deprecate normalization field (now no-op, RRF doesn't need it)
- Add comprehensive rrf-scoring.test.ts (21 tests covering single/multi adapter, chunks, weights, recency, custom rrfK)
- Update search-config-integration test for deprecated normalization

Closes constructive-io/constructive-planning#1047
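The scoring described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `AdapterRank` shape and the `rrfScore` function name are hypothetical, and it assumes "active adapters" means the adapters that contributed a rank for the row being scored.

```typescript
// Hypothetical shape: one entry per active adapter for a given row.
interface AdapterRank {
  rank: number;   // 1-based rank, e.g. as produced by ROW_NUMBER()
  weight: number; // per-adapter weight (assumed positive)
}

// Weighted RRF, normalized by the best achievable score
// (every active adapter placing the row at rank 1).
function rrfScore(active: AdapterRank[], rrfK = 60): number {
  if (active.length === 0) return 0;
  let sum = 0;
  let max = 0;
  for (const { rank, weight } of active) {
    sum += weight / (rrfK + rank); // this adapter's actual contribution
    max += weight / (rrfK + 1);    // its contribution if the row were rank 1
  }
  return sum / max; // bounded to (0, 1]
}

// A row ranked first by every active adapter scores exactly 1.
const allFirst = rrfScore([
  { rank: 1, weight: 1 },
  { rank: 1, weight: 2 },
]);

// A single adapter placing the row second scores 61/62 with rrfK = 60.
const second = rrfScore([{ rank: 2, weight: 1 }]);
```

Under these assumptions a row that every active adapter ranks first reaches the upper bound of 1, so range checks on the score need an inclusive comparison.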
Found 4 test failures on Blacksmith runners.
- Remove normalization from SearchConfig interface (not deprecated, deleted)
- Remove sigmoid strategy parameter from normalizeScore (only used as RRF fallback)
- Remove deprecated normalization warning code
- Delete sigmoid normalization test suite
- Delete deprecated normalization no-op test from rrf-scoring
- Remove normalization field from node-type-registry schema
- Update graphile-search skill docs for RRF
RRF scoring can produce exactly 1.0 for rank-1 documents when all adapters agree. Changed toBeLessThan(1) to toBeLessThanOrEqual(1).
…er path

ROW_NUMBER() window functions were being injected into the SELECT list for every per-adapter filter query (e.g. bm25Body, tsvTsv), but ranks are only needed for the unifiedSearch RRF composite path. The window function forced PostgreSQL to evaluate the BM25 score expression in the ORDER BY of the window, which could interact poorly with ParadeDB's BM25 index build timing. The searchScore lambda already has a fallback that estimates rank from normalized score for individual filters.
…seed

The BM25 test failures were caused by a ParadeDB index race condition: after INSERTing data, the BM25 index isn't immediately queryable when a ROW_NUMBER() window function references the BM25 operator in its ORDER BY (adding computational pressure during index build). Adding VACUUM ANALYZE after seeding forces the BM25 index to be fully built before tests run.

Restores ROW_NUMBER() injection in per-adapter filter path (needed for correct RRF scoring when individual filters are combined with searchScore).
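For readers unfamiliar with what the injected window function yields, here is an in-memory analogue of ROW_NUMBER() OVER (ORDER BY score DESC). It is only an illustration: the `Scored` type and `rowNumberRanks` helper are hypothetical, and the descending sort direction is an assumption, since the PR does not spell out the ordering per adapter.

```typescript
// Hypothetical row shape: an identifier plus one adapter's raw score.
interface Scored {
  id: string;
  score: number;
}

// Assign consecutive 1-based ranks by descending score.
// Like ROW_NUMBER(), tied scores still receive distinct ranks.
function rowNumberRanks(rows: Scored[]): Map<string, number> {
  const sorted = [...rows].sort((x, y) => y.score - x.score);
  const ranks = new Map<string, number>();
  sorted.forEach((row, i) => ranks.set(row.id, i + 1));
  return ranks;
}

const ranks = rowNumberRanks([
  { id: "a", score: 0.2 },
  { id: "b", score: 0.9 },
  { id: "c", score: 0.9 },
]);
```

Here the tie between "b" and "c" is broken by input order because JavaScript's sort is stable; in PostgreSQL the order among tied rows is unspecified unless the ORDER BY includes a tiebreaker.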
Summary
Replaces sigmoid weighted-average in searchScore with Reciprocal Rank Fusion (RRF), a rank-based fusion that handles BM25's unbounded scores correctly (sigmoid crushes the BM25 signal: scores of -20 and -10 both map to near-zero).

Ranks are injected via ROW_NUMBER() OVER (ORDER BY score ...) per adapter. Scores are normalized by the maximum possible RRF over active adapters only, so they are bounded to 0..1. New rrfK preset option (default 60).

Removes the SearchConfig.normalization field entirely (pre-launch, no deprecation). Deletes sigmoid test suites and dead code. 21 new RRF tests cover single/multi adapter combos, chunk tables, custom weights, recency boost, and custom rrfK.

Also adds VACUUM ANALYZE to mega-seed.sql after data insertion, which forces ParadeDB's BM25 index to be fully built before tests run, fixing a timing race with ROW_NUMBER() window functions.

Skills updated in both repos. Related: #1047, #1050, constructive-skills#173
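The sigmoid problem mentioned in the summary can be checked with a few lines. This sketch uses the standard logistic function and the two example scores from the summary; the `rrfTerm` helper is hypothetical and uses the default rrfK of 60.

```typescript
// Standard logistic sigmoid.
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

// The two raw scores from the summary's example.
const low = sigmoid(-20);  // on the order of 1e-9
const high = sigmoid(-10); // on the order of 1e-5

// Rank-based contribution: depends only on position, not on score magnitude.
const rrfK = 60;
const rrfTerm = (rank: number): number => 1 / (rrfK + rank);

const first = rrfTerm(1);  // 1/61
const second = rrfTerm(2); // 1/62

console.log({ low, high, first, second });
```

Both sigmoid outputs are effectively zero next to scores from adapters whose values sit near the middle of the sigmoid's range, whereas the rank-based terms stay on a comparable scale regardless of how the raw scores are distributed.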
Link to Devin session: https://app.devin.ai/sessions/4a9f098c74fb4cb6a9b6868fcff321db
Requested by: @pyramation