fix: install Claude Code CLI in Docker image by dcschreiber · Pull Request #13 · Sefaria/ai-chatbot

dcschreiber · 2026-02-04T06:46:11Z

Summary

Fixes the "Claude Code not found" error on deployed server
The claude-agent-sdk Python package spawns the Claude CLI as a subprocess
The Dockerfile now installs Node.js and @anthropic-ai/claude-code

Root Cause

The claude-agent-sdk is a Python wrapper around the Claude Code CLI. It uses shutil.which("claude") to find the CLI and spawns it with --output-format stream-json. Without the CLI installed, the agent service fails.

Changes

Added to the server stage of the multi-stage Dockerfile:

RUN apk add --no-cache nodejs npm \
    && npm install -g @anthropic-ai/claude-code

Test plan

Docker build succeeds
Claude CLI is accessible in container (/usr/local/bin/claude)
CLI version check works (claude --version → 2.1.31)
Deploy to dev and verify endpoints work

🤖 Generated with Claude Code

fix: add releaserc file

fix: tag string for build checkout

Test Infrastructure: - Add 255 tests covering router, guardrails, models, serializers, tool executor, prompt service, and reason codes - Add test_settings.py with SQLite in-memory database for tests - Configure pytest-asyncio for async test support - Add pyproject.toml with pytest and ruff configuration Developer Experience: - Add CLAUDE.md with project overview and development standards - Add .pre-commit-config.yaml with ruff and eslint hooks - Add setup.sh for one-command environment setup - Add start.sh for launching backend + frontend Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Create ARCHITECTURE.md with detailed system design - Document flows, components, data models, and API endpoints - Add system flow diagram and directory structure - Reference from CLAUDE.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@Traced

Removed files: - server/chat/logging/ - Custom BraintrustLogger module that was never called in the current codebase (only used in claude_service_old.py) - server/chat/agent/claude_service_old.py - Old agent implementation, replaced by current claude_service.py The current implementation uses Braintrust's native @Traced decorator for tracing, which is simpler and provides automatic span management. Updated README.md directory structure and views.py docstring to reflect that Braintrust tracing uses the native SDK decorators. Note: BraintrustLog and ToolCallEvent database models are retained for potential future use with local logging backup. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Comprehensive plan for restructuring trace logging to follow Braintrust best practices for eval-ready data: - Structured input with query + messages array (enables "Try prompt" UI) - Structured output with response + tool_calls + was_refused - Proper metadata organization (session, model config, routing, context) - Tags for categorical filtering (flow, channel, environment) - Channel/site tracking for multi-platform analytics Implementation broken into 7 discrete tasks that can be executed independently. Plan includes verification checklist and migration notes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

PR_DESCRIPTION.md: - Added Braintrust logging cleanup section BRAINTRUST_RESTRUCTURE_PLAN.md: - Added reference URLs at top for quick access - Added client_version to metadata (from context.clientVersion) - Clarified prompt versions are already tracked (no work needed) - Condensed plan from 414 to 290 lines for clarity - Added verification checklist item for client_version - Improved task descriptions with exact line numbers Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Enables refusal logging by ensuring last_user_message is available before the early return. This is a prerequisite for Task 7 in the Braintrust restructure plan (adding logging to refused requests). Also updates the restructure plan with: - TL;DR explaining string→object change - Correct page_type extraction (subdomain, /texts=home) - OpenAI message format rationale - Fix for last_user_message ordering - Updated references (official cookbook vs provided example) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Restructures trace logging to follow Braintrust best practices: - Structured input: {query, messages[]} instead of truncated string - Structured output: {response, refs[], tool_calls[], was_refused} - Tags: flow type + environment for filtering in Braintrust UI - Page context: site, page_type, page_url for traffic segmentation - Refusal logging: Previously invisible, now logged with full context Adds helper functions: - extract_page_type(): Parse Sefaria URLs to identify page types - extract_refs(): Extract Sefaria refs from tool calls Skipped tasks 1 & 8 (channel field) - Slack bot uses separate MCP architecture and doesn't go through this API. See docs/BRAINTRUST_RESTRUCTURE_PLAN.md for full implementation details. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Reorganized PR_DESCRIPTION.md with summary list at top - Simplified BRAINTRUST_RESTRUCTURE_PLAN.md to reflect what was implemented - Marked restructure plan as temp doc to remove after merge Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Consolidate redundant individual tests into parametrized tests - Move shared fixtures to module level to reduce duplication - Remove verbose docstrings that restated test names - Add explicit return type annotations - Replace imperative loops with list comprehensions All 297 tests pass with ~1,000 fewer lines of code. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add hooks/pre-commit to run ruff check + format on staged Python files - Update setup.sh to auto-install git hooks - Update start.sh to verify hooks are installed - Add ruff to requirements.txt - Apply ruff --fix and ruff format across all Python files - Configure ruff ignores: E501, E402 (Django setup), B023 (async false positives) - Fix bare except -> except Exception in tool_executor.py - Fix blind Exception -> IntegrityError in test_models.py Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add CI workflow running tests on push/PR to main/dev - SQLite job for fast feedback - PostgreSQL job (with service container) to catch DB-specific issues - Add test_settings_postgres.py with sensible local defaults Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Remove SQLite job from CI (redundant with PostgreSQL) - Add TESTING.md documenting local vs CI testing strategy - Reference TESTING.md from CLAUDE.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The refusal logging code was accessing route_result.safety.reason_codes, but SafetyResult only has 'allowed' and 'refusal_message' fields. The reason_codes field belongs to RouteResult. This would cause an AttributeError when any request was refused. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Tests _create_refusal_response method to ensure: - Refusal responses are created correctly - Reason codes are extracted from route_result.reason_codes This covers a gap in test coverage that allowed the previous AttributeError bug to go undetected. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@Traced

These models were designed for local logging backup but were never wired up to production code - only used in tests. - Removed BraintrustLog model (was for eval-ready log storage) - Removed ToolCallEvent model (was for tool call tracking) - Added migration 0004 to drop both tables - Updated README.md to remove from Database Models table - Added test troubleshooting note to CLAUDE.md The current implementation uses Braintrust's native @Traced decorator for tracing, which sends data directly to Braintrust. Local persistence is handled by ChatMessage which captures similar metrics. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Prevents "port already in use" errors by automatically killing any processes using ports 8001 and 5173 before starting backend/frontend. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Excludes node_modules, dist, venv, caches, logs, and lock files to reduce token usage when Claude Code indexes the codebase. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Dev tooling, test infrastructure, and Braintrust logging restructure confirmed by Akiva

Braintrust sdk

Add /api/v2/chat/anthropic endpoint that accepts and returns Anthropic Messages API format. This enables calling the agent from Braintrust playground and running evaluations with datasets. - New endpoint reuses existing ClaudeAgentService - Transforms requests/responses to Anthropic format - Supports content blocks and multi-turn messages - Includes 24 tests covering helpers and endpoint Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add eval script that can test the Anthropic-compatible endpoint using Braintrust's Eval framework. Supports: - Running against local or remote endpoints - Using Braintrust datasets or sample data - AutoEvals scorers (Factuality, AnswerRelevance) when available - Custom scorers for keyword matching and content validation Usage: BRAINTRUST_API_KEY=<key> python -m braintrust eval chat/evals/eval_anthropic_endpoint.py Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add TestChatAnthropicHTTPIntegration class with 9 tests covering full HTTP request-response cycle (JSON serialization, URL routing, headers) - Tests include Hebrew/Unicode handling, large messages, error formats - Remove chat/evals/ directory (using Braintrust UI for evals) - Fix flaky test_prompt_service test that depended on env vars - Add testing guideline to CLAUDE.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The error handler was returning str(e) which could leak sensitive internal information. Now returns a generic "Internal server error" message while still logging the full exception for debugging. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add APIKey model for service authentication with SHA-256 hashed storage - Add service_id field to ChatSession and ChatMessage (nullable user_id) - Add database constraint ensuring at least one identity is set - Create auth module with Actor dataclass and authenticate_request() - Support API key auth (Authorization: Bearer) and user token auth - Extract shared services (chat_service, session_service) for code reuse - Add X-Session-ID header support for multi-turn conversations - Add session ownership validation for security - Update turn logging service to accept Actor instead of user_id - Add comprehensive test coverage (222 tests passing) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Support API key auth via Authorization: Bearer header - Support user token auth via userId query param - Services filter messages by service_id, users by user_id - Enforce session ownership (services only see their sessions) - Add comprehensive tests for auth, messages, and pagination Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Migration was created prematurely - removing until ready. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Remove API key authentication, keeping only user token auth. - Remove APIKey model and service_id fields from models - Simplify Actor to only support user_id - Remove InvalidAPIKey exception and related error handling - Update all views to use user token auth only - Update tests to use user token authentication Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Accept userId via X-User-Id header to maintain Anthropic API structure compatibility. Header takes precedence over body userId field. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Changed from X-User-Id to X-Api-Key header for Anthropic standard compliance - Reverted history endpoint security changes to minimize diff scope - Removed test_history_endpoint.py (was testing reverted feature) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add AnthropicRequestSerializer to validate request format consistently with chat_stream_v2. Replaces manual validation with DRF serializer. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Make singleton agent service thread-safe with double-check locking - Remove debug log that exposed user_id - Add stricter validation for content blocks in extract_user_message - Move hardcoded model default to settings.DEFAULT_MODEL - Clean up redundant `or ""` in metadata.get() Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Revert get_agent_service() to create new instance each call (original behavior) - Remove DEFAULT_MODEL setting (keep hardcoded default) - Keep: removed debug log, stricter content validation, redundant `or ""` cleanup Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Remove run_agent_turn() which was never called - Remove unused imports (Callable, AgentResponse, ConversationMessage, get_agent_service) - Remove validate_session_ownership from exports (only used internally) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The ownership validation was ineffective because update_or_create overwrites the user_id before we check it. Now we query the existing session first and validate ownership before any update occurs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add Anthropic-compatible endpoint for Braintrust integration

The claude-agent-sdk Python package is a wrapper around the Claude Code CLI - it spawns `claude` as a subprocess. The deployed server was failing with "Claude Code not found" because the Dockerfile only installed Python dependencies. This adds Node.js and the @anthropic-ai/claude-code npm package to the server stage of the multi-stage build. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

BrendanGalloway and others added 30 commits January 15, 2026 14:28

fix: add releaserc file

8017b3b

Merge pull request #2 from Sefaria/pipeline

8eb513a

fix: add releaserc file

fix: tag string for build checkout

955687d

Merge pull request #3 from Sefaria/pipeline

d29b7ec

fix: tag string for build checkout

Added plugins

3add8db

docs: Add architecture documentation

9e8bf60

- Create ARCHITECTURE.md with detailed system design - Document flows, components, data models, and API endpoints - Add system flow diagram and directory structure - Reference from CLAUDE.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Merge branch 'dev' into daniel-init-playground

780b7db

docs: Remove skipped task references from docs

22101aa

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Core: removed. Temp PR description empty.

c89f4cb

docs: Add testing documentation and simplify CI

9db9529

- Remove SQLite job from CI (redundant with PostgreSQL) - Add TESTING.md documenting local vs CI testing strategy - Reference TESTING.md from CLAUDE.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

superpowers plugin

834527a

fix: Kill existing processes on ports before starting servers

0ce5187

Prevents "port already in use" errors by automatically killing any processes using ports 8001 and 5173 before starting backend/frontend. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: Add .claudeignore to exclude build artifacts

c28fdc7

Excludes node_modules, dist, venv, caches, logs, and lock files to reduce token usage when Claude Code indexes the codebase. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: update docker

1177cac

Merge pull request #4 from Sefaria/daniel-init-playground

7f50376

Dev tooling, test infrastructure, and Braintrust logging restructure confirmed by Akiva

chore: add user to run docker

8bd9557

chore: add temp

6260265

akiva10b and others added 30 commits January 28, 2026 21:54

chore: move to v2

8979ce3

feat: update flows

ef4ab8b

chore: cleanup

0749e73

chore: use braintrust sdk

fdf752f

chore: fix braintrust fixes and cleanup

f63ef28

Merge pull request #10 from Sefaria/braintrust-sdk

a9e20df

Braintrust sdk

fix: restore missing extract_refs function in v2 to enable tests to pass

f849643

Add preference for prose-style plans in CLAUDE.md

655831d

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: expand testing guideline in CLAUDE.md

864635c

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

style: auto-format files touched by linter

f40bfe0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore: remove service auth migration

515f19a

Migration was created prematurely - removing until ready. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

feat: support X-User-Id header authentication for Anthropic endpoint

e4349ad

Accept userId via X-User-Id header to maintain Anthropic API structure compatibility. Header takes precedence over body userId field. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

clean cluade md line

77d0365

docs: add Anthropic API deviations note to module docstring

7344e90

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: remove auth from deviations list (X-Api-Key is standard)

90ce7c1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

refactor: use serializer for Anthropic endpoint validation

433e8b8

Add AnthropicRequestSerializer to validate request format consistently with chat_stream_v2. Replaces manual validation with DRF serializer. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Merge pull request #12 from Sefaria/adjust-endpoint-for-braintrust

8e5c7cf

Add Anthropic-compatible endpoint for Braintrust integration

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: install Claude Code CLI in Docker image#13

fix: install Claude Code CLI in Docker image#13
dcschreiber wants to merge 93 commits into
mainfrom
fix/dockerfile-install-claude-cli

dcschreiber commented Feb 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants