feat(search): trigram fuzzy + concept recall (typo+semantic, zero regression) #31
Open
fedster99 wants to merge 1 commit into
Closes the two zero-scoring eval categories (typo, semantic) without touching the four already at 0.91–1.00.

Two pure-Postgres, recall-only branches added to the free-text path (search/expand.ts + compile.ts):

- Fuzzy (typo): significant query tokens match via pg_trgm word_similarity over the trigram-indexed subject/sender/recipient columns from 0007, driven through OPERATOR(extensions.<%) with pg_trgm.word_similarity_threshold set to 0.4 via SET LOCAL, which scopes the override to the read-only transaction.
- Concept (semantic): a curated, general email/business concept thesaurus widens the tsquery (primary || synonyms), so an intent word retrieves mail that never says it literally (vacation -> travel). Deterministic, zero-dependency Tier-1.5; Phase-2 embeddings remain the durable answer.

Safety: an is_primary flag (any exact lexical hit) is the leading ORDER BY key, so every exact match ranks strictly above any fuzzy/concept-only match. The recall branches can add results a keyword search misses but can never reorder the ones it already gets right: zero regression by construction.

Eval (pnpm eval:search), before -> after:

  headline nDCG@10 0.731 -> 0.953, recall@10 0.724 -> 0.953
  typo 0.00 -> 0.65
  semantic 0.00 -> 0.98
  lexical 0.91 -> 0.995
  operator/ranking/phrase/email-intent 1.00 (held)

Verified: pnpm typecheck, pnpm test (182 passed incl. 12 new expand unit tests), pnpm build, pnpm test:db:live (118/118 incl. the search-quality gate against the real engine).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
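To make the concept branch from the commit message concrete, here is a minimal TypeScript sketch of how a thesaurus lookup could widen a tsquery. It is not the code in search/expand.ts: the names `SAMPLE_THESAURUS`, `expandConceptsSketch`, and `buildTsquerySql`, the thesaurus entries, the `'english'` text-search configuration, and the `$1`/`$2` parameter layout are all assumptions made for illustration. It relies only on the Postgres functions `plainto_tsquery` and `to_tsquery` and on the `||` operator, which ORs two tsquery values.

```typescript
// Hypothetical mini-thesaurus for illustration only; the PR's actual
// CONCEPT_THESAURUS entries are not reproduced here.
const SAMPLE_THESAURUS: Record<string, readonly string[]> = {
  vacation: ["travel", "flight", "hotel"],
  hiring: ["candidate", "interview"],
};

// Collect de-duplicated, single-word synonyms for every query token
// that has an entry in the thesaurus.
function expandConceptsSketch(
  query: string,
  thesaurus: Record<string, readonly string[]> = SAMPLE_THESAURUS,
): string[] {
  const tokens = query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
  const synonyms: string[] = [];
  for (const token of tokens) {
    // Own-property check so tokens like "constructor" never hit the prototype.
    const entry = Object.prototype.hasOwnProperty.call(thesaurus, token)
      ? (thesaurus[token] ?? [])
      : [];
    for (const s of entry) {
      if (synonyms.indexOf(s) === -1) synonyms.push(s);
    }
  }
  return synonyms;
}

// Build the tsquery expression: the primary query alone, or the primary
// query OR-ed (||) with a synonym tsquery when any synonyms were found.
// $1 is assumed to carry the raw user query, $2 the synonym string.
function buildTsquerySql(
  synonyms: readonly string[],
): { sql: string; synonymParam: string | null } {
  if (synonyms.length === 0) {
    return { sql: "plainto_tsquery('english', $1)", synonymParam: null };
  }
  return {
    sql: "plainto_tsquery('english', $1) || to_tsquery('english', $2)",
    synonymParam: synonyms.join(" | "),
  };
}
```

With this sample thesaurus, a query containing "vacation" gains a second tsquery matching travel, flight, or hotel, while a query with no thesaurus entry is left exactly as a plain keyword search would see it. Multi-word synonyms would need extra handling before being passed to `to_tsquery`; the sketch sidesteps that by using single words only.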
What
Closes the two zero-scoring eval categories — typo and semantic — by adding two pure-Postgres, recall-only branches to the free-text search path, fused strictly below exact lexical matches. Stacks on #26 (the search-layer + eval + email-intel stack).
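The "fused strictly below exact lexical matches" guarantee can be pictured as a two-key sort. The sketch below is an in-memory TypeScript analogue of an ORDER BY whose leading key is an exact-hit flag; the `Hit` shape, the `score` field, and the `rankHits` name are hypothetical and do not come from the PR, which does this ordering in SQL.

```typescript
// Hypothetical result shape: isPrimary stands in for the is_primary flag
// (true when the row had any exact lexical hit); score is the within-group rank.
interface Hit {
  id: string;
  isPrimary: boolean;
  score: number;
}

// Sort with the exact-hit flag as the leading key, then by score descending.
// A fuzzy/concept-only hit can never outrank an exact hit, whatever its score.
function rankHits(hits: readonly Hit[]): Hit[] {
  return [...hits].sort((a, b) => {
    if (a.isPrimary !== b.isPrimary) return a.isPrimary ? -1 : 1;
    return b.score - a.score;
  });
}

const ranked = rankHits([
  { id: "fuzzy-only", isPrimary: false, score: 0.99 },
  { id: "exact-low", isPrimary: true, score: 0.1 },
  { id: "exact-high", isPrimary: true, score: 0.5 },
]);
// ranked ids: exact-high, exact-low, fuzzy-only
```

Because the flag is compared first, dropping the non-primary rows from the ranked list leaves the exact matches in the same relative order a keyword-only search would produce, which is the sense in which the new branches can add results but not reorder existing ones.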
Why this shape
Reading the eval corpus showed the two gaps are different problems:
- typo (invioce, secuirty, candiate, metrcs) = misspellings of indexed terms → trigram fuzzy.
- semantic (vacation, hiring, breach, expense) = concept words with no literal overlap → query concept-expansion (not person-resolution).

So this is the roadmap's Phase 1 / Tier 1.5 (docs/search-eval.md), done as the smallest eval-justified slice: no embeddings, no new deps.

How
- apps/api/src/search/expand.ts (new): significantTerms (fuzzy tokens) + a curated, general email/business CONCEPT_THESAURUS (expandConcepts).
- Fuzzy: word_similarity over the trigram-indexed subject/sender/recipient columns from 0007, via OPERATOR(extensions.<%) so the gin_trgm_ops GINs drive it; pg_trgm.word_similarity_threshold = 0.4 set with SET LOCAL, scoped to the read-only txn.
- Concept: widens the tsquery (primary || synonyms) so an intent word retrieves mail that never says it literally.
- Safety: an is_primary flag (any exact lexical hit) is the leading ORDER BY key, so exact matches always rank above fuzzy/concept-only matches. The recall branches add results that were previously missed; they can never reorder existing hits. The partial-index deleted_in_provider = false literal is preserved.

Result: pnpm eval:search, before → after

| Metric / category | Before | After |
| --- | --- | --- |
| headline nDCG@10 | 0.731 | 0.953 |
| headline recall@10 | 0.724 | 0.953 |
| typo | 0.00 | 0.65 |
| semantic | 0.00 | 0.98 |
| lexical | 0.91 | 0.995 |
| operator / ranking / phrase / email-intent | 1.00 | 1.00 (held) |

Residual typo misses are deliberate scope edges: a body-only term (body trigram is unindexed for perf) and a transposition below the 0.4 threshold whose third judged doc has no literal token. They are left unchased to avoid overfitting the threshold to a 31-doc synthetic corpus. Open-ended semantics beyond the curated thesaurus remain Phase-2 (opt-in pgvector).
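For intuition on why a transposition can land under a 0.4 threshold, the following TypeScript sketch computes a plain trigram set similarity (shared trigrams divided by distinct trigrams), using the padding pg_trgm documents: two spaces before each word and one after. This is an assumption-laden simplification: it approximates pg_trgm's whole-string `similarity`, not the `word_similarity` measure behind the `<%` operator that this PR uses, so its numbers will not match the engine's. The `trigrams` and `trigramSimilarity` names are hypothetical.

```typescript
// Extract the set of trigrams for a single word, padded pg_trgm-style
// with two leading spaces and one trailing space.
function trigrams(word: string): Set<string> {
  const padded = `  ${word.toLowerCase()} `;
  const out = new Set<string>();
  for (let i = 0; i + 3 <= padded.length; i++) {
    out.add(padded.slice(i, i + 3));
  }
  return out;
}

// Plain set similarity: |shared trigrams| / |distinct trigrams in either word|.
// NOTE: a simplification of pg_trgm's similarity(); word_similarity() instead
// compares against the best-matching extent of the second string.
function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// "invioce" vs "invoice": 4 shared trigrams out of 12 distinct ones.
const typoScore = trigramSimilarity("invioce", "invoice");
```

Swapping two adjacent letters changes every trigram that spans the swap, so a single transposition in a seven-letter word removes half of its eight trigrams under this measure. That is the kind of edge the residual-miss note describes, and it suggests why chasing it would mean lowering the threshold rather than fixing a bug.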
Verification
- pnpm typecheck ✓ · pnpm build ✓
- pnpm test → 182 passed (incl. 12 new search-expand unit tests) ✓
- pnpm test:db:live → 118/118, incl. search-quality.live-db.test.ts (the quality gate) and search.live-db.test.ts against the real engine ✓

Harness Impact

docs/search-eval.md updated: Phase 1 marked shipped with the measured delta and the goal marked met. No AGENTS.md / ADR change needed (additive to the 0015 engine; no schema change).

🤖 Generated with Claude Code