DatabricksAdapter yields text content during tool-calling rounds, causing response duplication #421

@cuchaga

Problem

When using DatabricksAdapter with Claude models via Databricks Model Serving (OpenAI-compatible endpoint), the model produces substantial text content alongside tool calls in the same response turn. The adapter's streamCompletion() method yields all delta.content text as message_delta events regardless of whether the same response also contains tool calls.

In a multi-step ReAct loop, the model's intermediate text (which often includes the full answer) is therefore emitted and displayed to the user on every tool-calling turn, so the same answer appears 3-4× in the final output.

Root Cause Analysis

Claude's native API uses a trained-in tool-use system prompt that shapes the model to produce brief commentary before tool_use blocks. The Databricks OpenAI-compatible translation layer does not include this trained-in prompt, so Claude produces more verbose text during tool-calling rounds.

The adapter loop (run() method, line ~196) correctly breaks when no tool calls are present. During tool-calling rounds, however, streamCompletion() yields all delta.content as message_delta events; there is no mechanism to suppress or buffer text that arrives in a turn that also contains tool calls.

Expected Behavior

Text content from tool-calling rounds should either:

  1. Be suppressed (not yielded as message_delta) when the same response also contains tool_calls
  2. Be buffered until streamCompletion returns, and only yielded if no tool calls were detected
  3. Be marked with metadata so consumers can distinguish intermediate text from final-answer text
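
A rough sketch of how options 2 and 3 could look, in case it helps the discussion. Everything here is hypothetical: the StreamChunk and AdapterEvent shapes, the finalizeRound name, and the intermediate flag are stand-ins for illustration, not AppKit's actual types or API. The idea is to collect one model response in full, then decide what to emit once it is known whether the round contained tool calls.

```typescript
// Hypothetical shapes; AppKit's real chunk and event types may differ.
interface StreamChunk {
  content?: string; // text taken from delta.content
  toolCall?: { id: string; name: string; args: string };
}

type AdapterEvent =
  | { type: "message_delta"; text: string; intermediate: boolean }
  | { type: "tool_call"; id: string; name: string; args: string };

// "suppress" drops text from tool-calling rounds (option 2);
// "mark" keeps it but flags it as intermediate (option 3).
type TextPolicy = "suppress" | "mark";

// Buffers one complete model response and decides what to emit afterwards.
function finalizeRound(chunks: StreamChunk[], policy: TextPolicy): AdapterEvent[] {
  let text = "";
  const toolCalls: AdapterEvent[] = [];

  for (const chunk of chunks) {
    if (chunk.content) text += chunk.content;
    if (chunk.toolCall) {
      const { id, name, args } = chunk.toolCall;
      toolCalls.push({ type: "tool_call", id, name, args });
    }
  }

  const hasToolCalls = toolCalls.length > 0;
  const events: AdapterEvent[] = [];

  // Text is emitted when the round is a final answer, or when the caller
  // asked for intermediate text to be marked rather than dropped.
  if (text !== "" && (!hasToolCalls || policy === "mark")) {
    events.push({ type: "message_delta", text, intermediate: hasToolCalls });
  }
  return [...events, ...toolCalls];
}
```

The trade-off with this approach is that text from the final round is no longer streamed token by token; it only becomes visible once the response has finished. Marking instead of suppressing keeps streaming possible, at the cost of pushing the filtering decision to consumers.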

Current Workaround

We added:

  • A system prompt instruction telling the model to only write brief status notes during tool-calling turns
  • Client-side text buffering in the SSE transport layer that discards accumulated text when a function_call event arrives in the same round
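
For reference, the client-side buffering works roughly like the simplified sketch below. The message_delta and function_call event names match what we see on the wire; the round_end marker, the SseEvent type, and the RoundTextBuffer class are hypothetical simplifications, and how a round boundary is actually detected will depend on the transport.

```typescript
// Simplified event shapes; round_end is a hypothetical round-boundary marker.
type SseEvent =
  | { type: "message_delta"; text: string }
  | { type: "function_call"; name: string }
  | { type: "round_end" };

class RoundTextBuffer {
  private pending = "";
  private sawFunctionCall = false;

  // Returns text that is safe to display, or "" when nothing should be shown.
  handle(event: SseEvent): string {
    switch (event.type) {
      case "message_delta":
        // Hold text back until the round is known to be free of tool calls.
        if (!this.sawFunctionCall) this.pending += event.text;
        return "";
      case "function_call":
        // A tool call in this round means the buffered text was intermediate.
        this.sawFunctionCall = true;
        this.pending = "";
        return "";
      case "round_end": {
        const visible = this.pending;
        this.pending = "";
        this.sawFunctionCall = false;
        return visible;
      }
    }
  }
}
```

This hides the duplication from the user, but it has to be reimplemented by every consumer, which is why a fix in the adapter would be preferable.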

Environment

  • AppKit version: 0.38.1
  • Model: databricks-claude-opus-4-6 via Model Serving
  • Adapter: DatabricksAdapter.fromServingEndpoint()

Reproduction

  1. Create an agent with maxSteps: 10 and multiple tools (e.g., describeTable, executeSql)
  2. Ask a question that triggers 3+ tool calls
  3. Observe that the assistant's text response contains the full answer repeated once per tool-calling round
