feat(chatbot): add in-page-nav attribute for cauldron testing#149
feat(chatbot): add in-page-nav attribute for cauldron testing#149yodem wants to merge 40 commits into
Conversation
…uring streaming When the agent completes a source-fetching tool call (get_text, get_english_translations), a collapsible yellow box appears so the user can start reading the reference on Sefaria while Claude finishes its response. The box expands during streaming and collapses to one line once the full reply arrives. - Add tool_input to tool_end SSE progress events (tool_runtime.py) - New SourceSuggestion.svelte component (book icon, header, Sefaria link) - LCChatbot: track firstSourcePreview state, SOURCE_PROVIDING_TOOLS set, onProgress capture logic, streaming + post-response render slots - CSS via :global() rules in root component (shadow DOM requirement) - i18n: source.readWhileWaiting, source.readOnSefaria keys (en.json)
Adds SefariaClient.search_topics() which calls api/name/{query}?type=topic
and returns [{title, slug}, ...] filtered to Topic completions only.
Creates server/chat/V2/appetizer/ package with three passing unit tests.
Also fixes pre-existing B905 ruff warning (zip without strict=).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nd tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Renders a growing list of tool-call/status progress entries with
collapse/expand toggle (collapsed=true shows "Show thinking (N steps)",
collapsed=false streams all entries live).
SVG icons reuse paths from LCChatbot.svelte for visual consistency.
NOTE: requires two i18n keys in en.json (a later task):
"progress.showThinking": "Show thinking ({count} steps)"
"progress.hideThinking": "Hide thinking ({count} steps)"
Until added, svelte-i18n will fall back to the raw key strings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace single thinking bubble with scrolling progress trail showing all tool calls and status events. Trail collapses to toggle after answer arrives. Adds i18n keys and shadow DOM CSS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…er event Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Show a Sefaria topic link in a yellow box within 5 seconds via the parallel Haiku appetizer pipeline. Includes collapse/expand, fade-in animation, and appetizer_click custom event for analytics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Fix executor leak: shut down appetizer_executor in finally block 2. Fix tool_end matching: use findLastIndex by toolName instead of array position — prevents stuck spinners when status events interleave with tool calls 3. Fix stuck status entries: mark all running trail entries as complete before persisting to assistantMessage 4. Fix empty state: show "Thinking..." fallback before first SSE progress event arrives 5. Fix NONE check: strip trailing punctuation before comparing Haiku's concept extraction response 6. Fix multi-instance: use $host() instead of document.querySelector for appetizer click tracking 7. Remove dead currentProgress variable (no longer read in template) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tier 1: regex strips prompt wrappers, sends keywords to Sefaria name API (<500ms). Covers most prompts. Tier 2: Haiku extracts concept for abstract prompts (2-4s). Both run inside 5-second timeout. No Haiku cost for common queries.
The Sefaria api/name endpoint returns the Shabbat tractate (type=ref)
before the Shabbat topic. Now search_topics tries a direct slug lookup
via api/v2/topics/{slug} when the name API returns zero Topic-type
results. Also adds debug logging to the appetizer thread.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The chat.appetizer logger was not registered in Django's LOGGING config, causing appetizer logs to be silently dropped and making it difficult to debug appetizer functionality. Added chat.appetizer logger entry with handlers=['chat_console'], level='INFO', propagate=False to match other chat module loggers.
Test results from May 25, 2026 after fixing chat.appetizer logging: - TA-1: PASS — appetizer appeared at ~12.8s for 'find me sources about Shabbat' - TA-2: PASS — link href=https://www.sefaria.org/topics/shabbat, text='Shabbat →' - TA-3: PASS — appetizer fully collapsible: clicking header toggles aria-expanded and removes/adds .appetizer-body - TA-4: PASS — translation prompt ('translate Genesis 1:1') does not generate appetizer, only discovery prompts do - TA-5: PASS — appetizer persists across multiple messages, remains collapsed initially Root cause of TA-1 regression was missing 'chat.appetizer' logger config in Django LOGGING dict.
Move appetizer thread start from inside generate_sse() to right after authentication — before session creation, summary loading, and message saving. This eliminates the 2-5s setup overhead and delivers the appetizer event within ~450ms of the request. Fix TopicAppetizer link click by using window.open() explicitly instead of relying on shadow DOM <a> default navigation which was unreliable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Record timing for each appetizer pipeline step (classification, tier-1 search, tier-2 Haiku extraction, tier-2 search) and attach as metadata.appetizer on the Braintrust trace span. Non-intrusive — metrics are collected passively and attached after the agent completes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
window.location.href navigates away from the page and destroys the chatbot. Revert to window.open(_blank) which safely opens the topic in a new tab. Same-page SPA navigation requires Sefaria host-side changes to intercept the appetizer_click custom event. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…very Many reverse proxies (nginx, gunicorn, cloud LBs) buffer the first 4-8KB before forwarding to the client. This caused SSE events (including the appetizer) to be delayed ~25 seconds on production. Send a 4KB SSE comment at the start of the stream to flush these buffers. The comment is ignored by EventSource but forces the proxy to start forwarding immediately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Log timestamps at each stage: appetizer thread start, SSE generator start, proxy flush, first event yield. All relative to request start time for easy correlation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove target="_blank", window.open, and e.preventDefault. Use a plain <a href> so the browser's native click event propagates up through the shadow DOM. Sefaria's React app intercepts same-origin clicks and does client-side routing — the chatbot stays alive and the topic page loads in the same tab. Verified on production: clicking the topic link navigates to the topic page while the chatbot remains visible with conversation intact. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…andle /topics/ Sefaria's sefaria:bootstrap-url event handles text references (/Genesis.1.1) but silently ignores topic URLs (/topics/shabbat). Detect /topics/ paths in handleMessageLinkClick and fall back to window.open so topic links actually navigate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…action The old tier-1 used English-only regexes + direct name API lookup. Hebrew prompts like 'תן לי מקורות על סיוון כ'' sent the raw Hebrew to the name API, where 'תן' matched 'תניא' (Tanya) — wrong topic. Now Haiku is the single entry point: it extracts the core Jewish concept from any prompt language, then the name API resolves the slug. Expected latency: ~1.5–2.5s (well under the 5s timeout). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs: 1. Haiku prompt too narrow: 'if not about Jewish texts, return NONE' filtered out Sefaria topics like parenting, money, relationships. New prompt scopes to 'topics that could appear in the Sefaria library' which covers universal topics through a Jewish-text lens. 2. Appetizer never dismissed: once shown it persisted forever, even after the answer arrived — cluttering the loading experience. Now cleared immediately after the answer message is committed to state (the topic is still saved on the message object for history). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pic card Previous commit incorrectly cleared appetizerData (the topic card). The intended behavior: clear firstSourcePreview so the source mention disappears once the answer is shown, reducing loading-experience clutter. The TopicAppetizer persists as intended. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted SourceSuggestion.svelte and all references: - import, SOURCE_PROVIDING_TOOLS constant, firstSourcePreview + sourcePreviewMessageId state, tool_end population logic, both render sites (streaming + history), and all CSS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… in trail Appetizer improvements: - Haiku now returns up to 3 ranked candidates (comma-separated) instead of a single concept. Tries each until one matches — fixes cases like 'Herod the Great' where the first candidate fails but 'Herod' hits. Multiple candidates also add natural think-time (~2-4s total). - Canonical short names prompted: 'Herod' not 'Herod the Great'. - search_topics now accepts PersonTopic and AuthorTopic in addition to Topic — 'Herod' was filtered out because it's type PersonTopic. ProgressTrail: - Tool entry descriptions with single-quoted refs (e.g. 'Pesachim 119b') are now rendered as clickable sefaria.org links. SourceSuggestion: - File restored (kept for future re-enablement, currently unused). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e double quotes)
Refs in tool descriptions use double quotes e.g. "Mishnah Shabbat 7:2".
Previous regex only matched single quotes. Updated to use backreference
so both 'Pesachim 119b' and "Mishnah Shabbat 7:2" are linkified.
Non-ref quoted strings ("both", "en") still pass through unchanged since
refToUrl returns null for them.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Upgrade appetizer model from Haiku to Sonnet for smarter topic extraction - Return up to 3 matched topics (exhaustive search, deduplicated by slug) - Bump timeout from 5s to 8s for Sonnet inference + sequential API calls - SSE payload now sends topics array instead of single flat object - Frontend renders comma-separated topic links (no arrow, no hint) - New header: "Discover sources connected to related topics..." - Same-page navigation via sefaria:bootstrap-url event (no new tab) - Migration guard for old flat appetizerData format in localStorage - ProgressTrail refs styled bold with external-link icon - Refined prompt: prefer specific topics, distinct candidates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sonnet generates full markdown responses with search XML tags instead of comma-separated topic names. Haiku follows the system prompt correctly and returns quality multi-topic results in ~1.5-2.5s. Keep 8s timeout for headroom. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ction Sonnet ignores system prompts and enters "assistant mode" with free-form text. Tool-forcing via tool_choice + extract_topics tool constrains output to structured data — physically impossible to produce markdown. Returns 3 quality topics in ~2-3.5s. Based on CandleKeep research: Anthropic Prompting Best Practices recommends structured output tools with tool_choice as the strongest format enforcement for Claude 4.x models (prefilling is deprecated/errors on 4.x). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ation <a> tags with href always navigate even with e.preventDefault() in shadow DOM context. Switched to <button> elements styled as links — clicking dispatches sefaria:bootstrap-url for in-page navigation without any default browser navigation to intercept. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lsewhere Svelte 5 delegated onclick doesn't reliably preventDefault on <a> tags in shadow DOM. Fix: use Svelte action (use:attachClickHandler) with direct addEventListener for reliable preventDefault. On sefaria.org: dispatches sefaria:bootstrap-url for in-page navigation. Off sefaria.org: window.open as fallback (opens topic in new tab). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Header was truncating in narrow panel. Shortened to "Discover sources on related topics". Added margin-right: 4px on comma separators so topic names don't run together. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…page navigation Previously, message-body links to /topics/* were opening in a new tab. This removes the early return so they dispatch the sefaria:bootstrap-url event, enabling in-page navigation on Sefaria just like ref links. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Filter external-domain links: open in new tab instead of dispatching sefaria:bootstrap-url with a foreign path - Filter off-Sefaria embeds: open topic/ref on sefaria.org in new tab when the host page isn't sefaria.org (mirrors handleAppetizerClick) This makes handleMessageLinkClick consistent with handleAppetizerClick, which already has the onSefaria hostname guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds in-page-nav boolean attribute that forces sefaria:bootstrap-url dispatch for all Sefaria links even when the host page is not on sefaria.org. Set this on cauldron/staging embeds to test topic and source in-page navigation without a production deployment. Do not deploy to production. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
The preview deployment for sefaria/ai-chatbot:server is ready. 🟢 Open app | Open Build Logs | Open Application Logs Last updated at: 2026-06-03 16:21:07 CET |
|
I'll analyze this PR systematically, reviewing the code changes, architecture, test coverage, and quality. PR OverviewThis PR implements a "Topic Appetizer" feature — a parallel pipeline that shows relevant Sefaria topic links within 5-8 seconds while the main AI agent processes the user's query. It also adds a "Progress Trail" component that shows streaming tool call progress. Key Changes
Issues Found🔴 Critical Issues1. Hardcoded Secret in Load Test Script CHATBOT_USER_TOKEN_SECRET=Mg57y8NKXyaZCQHFbFvXECUEoWhjGoUXJbbVt3H
USER_ID="cLmwgp3ikvK5B0883-1U6JIhm1ZkQFTuiX19h0PGzRdqfuo9rYCuSd9Fob6oG7Yx5tA4fuhH6SZ18PaE1S13-u8DD0oK7sZfn96Zo8INN0PI8Bfhxco4bA2d66dEJ8ePTCp2NMkbmrRd9Dny9sijiY-eKOAJcDZlNb4JPaSHlO2t9ckKbdn_IhQqPboINA=="A secret key and what appears to be an encrypted user token are hardcoded in the script. Even if this is a test/staging secret, it should not be committed to source control. The comment says "Requires: CHATBOT_USER_TOKEN_SECRET" as an env var, but then hardcodes it anyway. 2. XSS Risk in function linkifyRefs(text) {
const escaped = text.replace(/&/g, '&').replace(/</g, '<').replace(/>/g, '>');
// ...
return escaped.replace(/(['"])([^'"]+)\1/g, (match, quote, ref) => {
const url = refToUrl(ref);
if (!url) return match;
return `${quote}${icon}<a class="trail-ref-link" href="${url}" ...>${ref}</a>${quote}`;
});
}The Actually, looking more carefully: the outer 3. result = asyncio.run(appetizer_service.find_appetizer(data["text"]))Using 🟡 Medium Issues4. stream_closed = False
# ...
def run_appetizer():
if result and not stream_closed:
progress_queue.put(update, timeout=0.5)
5. Appetizer Executor Not Awaited on Error Paths executor.shutdown(wait=False)
appetizer_executor.shutdown(wait=False)The 6. Inconsistent Timeout Exception Handling except TimeoutError:Should be 7. Playwright MCP Snapshot Files Committed 8. Duplicate PNG Files 9. tool_input=tool_input,
10. function normalizeAppetizerData(raw) {
if (!raw) return null;
if (raw.topics) return raw;
return { topics: [{ topicSlug: raw.topicSlug, topicTitle: raw.topicTitle, topicUrl: raw.topicUrl }] };
}This shim handles old single-topic format. If this is for backward compatibility with persisted messages, it's fine. But it suggests the data model changed mid-development without a clean migration. Consider documenting when this shim can be removed. 🟢 Minor Issues11. 12. Looking at the component more carefully — the header is a 13. Plan Documents in 14. user-id="VrEKrbLVo8iV0AgQDGNStkXu0fWsD08bM7iYQ2AwZKRkiTrPuoBxYTPWumaIdMLw_..."An encrypted user token is committed to Test CoverageThe test file
Coverage estimate: ~85% of the new backend code is tested. The The test for Architecture AssessmentThe two-tier approach (direct API → LLM fallback) from the plan was replaced with a single LLM-first approach using Sonnet (not Haiku) with tool-forcing. This is a significant architectural change from the plan:
This is actually a better design for quality but uses a more expensive model (Sonnet vs Haiku). The 8-second timeout (vs 5-second in the plan) reflects this tradeoff. The parallel thread architecture is sound. The SummaryThis is a well-executed feature with good test coverage on the backend. The main concerns are:
Code Quality ScoreSub-scores: S (Scope) = 16/20
A (Architecture) = 15/20
I (Implementation) = 14/20
R (Risk) = 10/20
Q (Quality) = 11/15
P (Performance/Security) = 2/5
Base Score = 16 + 15 + 14 + 10 + 11 + 2 = 68 Effort Scale:
Final Score: 68 × 1.0 = 68 Code Quality Data (JSON){
"_schema": "code_quality_v5",
"total_score": 68,
"total_factors": "68 × 1.0 (Extra Large ESF) = 68",
"scope_score": 16,
"scope_factors": "8+ files across frontend/backend/config; new SSE event type; parallel thread architecture; new UI components; cross-cutting frontend-backend-external API integration; analytics event dispatch",
"architecture_score": 15,
"architecture_factors": "New parallel appetizer thread pattern; new component hierarchy (TopicAppetizer, ProgressTrail, SourceSuggestion); SSE protocol extension; slug fallback in search_topics; progress trail replaces single-bubble thinking indicator; tool-forcing pattern for structured LLM output",
"implementation_score": 14,
"implementation_factors": "asyncio.wait_for timeout wrapping; multi-candidate LLM extraction with deduplication; tool-forcing for structured output; linkifyRefs regex ref detection; findLastIndex for correct tool_end matching; normalizeAppetizerData backward compat shim; in-page navigation logic with hostname checks",
"risk_score": 10,
"risk_factors": "New parallel thread adds concurrency complexity (+3); hardcoded secrets in committed files (+4); no feature flag for appetizer (+2); tool_input exposure in tool_end events (+2); stream_closed race condition (+1); SSE protocol extension is backward compatible (-2)",
"quality_score": 11,
"quality_factors": "Good backend test coverage ~85% of new service code (+4); edge case tests: deduplication, partial hits, timeout, Hebrew (+3); feature-tests.json as integration record (+2); plan documents provide context (+1); no frontend |
Adds a "reading now" context chip at the top of the chat that displays the Sefaria text ref currently open in the reader. Clicking it navigates back to that ref via sefaria:bootstrap-url. Also removes version_language from the get_text tool description to clean up thinking step labels. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
in-page-navboolean attribute to<lc-chatbot>that forcessefaria:bootstrap-urldispatch for all Sefaria links, even when the host page is not onsefaria.orgUsage
Add
in-page-nav="true"to the chatbot embed in the Sefaria-Project cauldron branch:This branch is for cauldron testing only. The
in-page-navattribute bypasses the production hostname guard. Delete after testing.🤖 Generated with Claude Code