fix: think token splitting #214

Merged
artus9033 merged 3 commits into callstackincubator:main from JKobrynski:fix/think-token-splitting
Jun 9, 2026

Conversation

@JKobrynski

Related issue - #199

Summary

Makes the @react-native-ai/llama streaming parser resilient to split reasoning and tool-call delimiters.
Previously, the fallback stream adapter assumed markers like <think>, </think>, <tool_call>, and </tool_call> arrived as whole tokens. When tokenization split those markers across multiple streamed chunks, the parser could miss the transition entirely and leak reasoning or marker text into normal output.
This change extracts the stream state machine into a dedicated helper and updates it to buffer possible placeholder fragments, detect markers even when they are split or embedded inside a larger chunk, and preserve normal text/tool-call behavior.
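
To make the buffering idea above concrete, here is a minimal sketch of one way to detect delimiters that arrive split across chunks. It is not the code from streamParser.ts: the names `createThinkParser`, `partialMarkerSuffix`, and `StreamPart` are hypothetical, and it handles only the <think> / </think> pair. The assumption is that the parser holds back any trailing text that could be the start of the next expected marker and emits the rest immediately.

```typescript
// Illustrative sketch only; all names here are hypothetical, not the package's API.
type StreamPart =
  | { type: 'text'; text: string }
  | { type: 'reasoning'; text: string };

const OPEN = '<think>';
const CLOSE = '</think>';

// Length of the longest suffix of `text` that is a proper prefix of `marker`.
function partialMarkerSuffix(text: string, marker: string): number {
  const max = Math.min(text.length, marker.length - 1);
  for (let len = max; len > 0; len--) {
    if (text.slice(text.length - len) === marker.slice(0, len)) return len;
  }
  return 0;
}

function createThinkParser() {
  let buffer = '';
  let inReasoning = false;

  // Emits non-empty text under the current mode.
  function emit(parts: StreamPart[], text: string): void {
    if (text.length === 0) return;
    parts.push({ type: inReasoning ? 'reasoning' : 'text', text });
  }

  return {
    push(chunk: string): StreamPart[] {
      const parts: StreamPart[] = [];
      buffer += chunk;

      // Consume every complete marker, including ones embedded mid-chunk.
      while (true) {
        const marker = inReasoning ? CLOSE : OPEN;
        const index = buffer.indexOf(marker);
        if (index === -1) break;
        emit(parts, buffer.slice(0, index));
        buffer = buffer.slice(index + marker.length);
        inReasoning = !inReasoning;
      }

      // Hold back a trailing fragment that might be the start of the next marker.
      const held = partialMarkerSuffix(buffer, inReasoning ? CLOSE : OPEN);
      emit(parts, buffer.slice(0, buffer.length - held));
      buffer = buffer.slice(buffer.length - held);
      return parts;
    },

    // Call once the stream ends; a held fragment that never completed is plain content.
    flush(): StreamPart[] {
      const parts: StreamPart[] = [];
      emit(parts, buffer);
      buffer = '';
      return parts;
    },
  };
}

// Usage: both delimiters are split across chunk boundaries.
const demoParser = createThinkParser();
const demoParts: StreamPart[] = [];
for (const chunk of ['<thi', 'nk>plan', '</th', 'ink>answer']) {
  demoParts.push(...demoParser.push(chunk));
}
demoParts.push(...demoParser.flush());
console.log(demoParts);
// [ { type: 'reasoning', text: 'plan' }, { type: 'text', text: 'answer' } ]
```

Holding back at most `marker.length - 1` characters keeps the added latency small: ordinary text that merely contains a `<` is released as soon as the next chunk shows it is not a marker.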

Changes

  • Extract the llama stream state machine into createLlamaStreamParser
  • Replace inline doStream() parsing with the extracted parser helper
  • Buffer partial placeholder fragments across streamed chunks
  • Detect placeholders even when they appear inside a larger chunk
  • Handle split <think> / </think> delimiters correctly
  • Keep normal text streaming behavior intact when no structured markers are present
  • Preserve tool-call behavior while handling split <tool_call> / </tool_call> markers
  • Avoid duplicate emission for accumulated tool calls with ids
  • Preserve distinct no-id tool calls with identical payloads
  • Exclude package test files from packages/llama typecheck/build inputs
  • Add test-only lint override for Bun-based tests
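
The two tool-call bullets above (dedupe by id, keep identical no-id calls) can be illustrated with a small sketch. This is an assumption about one possible approach, not the logic merged in this PR: `ToolCall`, `createToolCallDeduper`, and `takeNew` are hypothetical names, and the sketch assumes each update carries the full, append-only list of tool calls accumulated so far.

```typescript
// Illustrative sketch only; the real tool-call shape in the package may differ.
interface ToolCall {
  id?: string;
  name: string;
  arguments: string;
}

function createToolCallDeduper() {
  const seenIds = new Set<string>();
  let emittedWithoutId = 0;

  // Takes the accumulated list and returns only the calls not emitted before.
  return function takeNew(accumulated: ToolCall[]): ToolCall[] {
    const fresh: ToolCall[] = [];
    let withoutIdSeen = 0;

    for (const call of accumulated) {
      if (call.id !== undefined) {
        // Calls with an id are emitted once, however often they reappear.
        if (!seenIds.has(call.id)) {
          seenIds.add(call.id);
          fresh.push(call);
        }
        continue;
      }

      // Calls without an id are tracked by position, not by payload, so two
      // identical payloads still count as two distinct calls.
      withoutIdSeen += 1;
      if (withoutIdSeen > emittedWithoutId) {
        emittedWithoutId = withoutIdSeen;
        fresh.push(call);
      }
    }
    return fresh;
  };
}

// Usage: the second snapshot repeats call "a" and adds call "b".
const takeNewDemo = createToolCallDeduper();
const callA: ToolCall = { id: 'a', name: 'search', arguments: '{"q":"x"}' };
const callB: ToolCall = { id: 'b', name: 'search', arguments: '{"q":"y"}' };
console.log(takeNewDemo([callA]).length); // 1
console.log(takeNewDemo([callA, callB]).length); // 1 (only "b" is new)
```

Comparing payloads alone would collapse two legitimate identical no-id calls into one, which is why the sketch counts positions for that case.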

Testing

  • Ran bun test packages/llama/src/__tests__/streamParser.test.ts
  • Ran ./node_modules/.bin/tsc --noEmit --project packages/llama/tsconfig.json
  • Ran ./node_modules/.bin/tsc --noEmit --project packages/llama/tsconfig.build.json
  • Ran bun lint
  • Verified on iOS device that previously reproduced <think> delimiter-splitting leakage no longer appears
  • Added regression coverage for:
    • split <think> / </think> markers
    • embedded markers inside a reasoning chunk
    • single-chunk <think>reasoning</think>final answer
    • split opening and closing <tool_call> markers
    • tool-call accumulation / dedupe behavior

@vercel

vercel Bot commented Jun 8, 2026

@JKobrynski is attempting to deploy a commit to the Callstack Team on Vercel.

A member of the Team first needs to authorize it.

@JKobrynski JKobrynski marked this pull request as ready for review June 8, 2026 14:17
@artus9033 artus9033 requested a review from Copilot June 9, 2026 04:15
@artus9033 artus9033 self-assigned this Jun 9, 2026
@artus9033 artus9033 self-requested a review June 9, 2026 04:15
@artus9033 artus9033 removed their assignment Jun 9, 2026

Copilot AI left a comment

Pull request overview

This PR hardens @react-native-ai/llama’s streaming adapter against cases where <think> / <tool_call> delimiters are split across streamed chunks, by extracting the delimiter-handling state machine into a dedicated parser and adding buffering/placeholder detection logic.

Changes:

  • Extracted streaming parsing into createLlamaStreamParser and integrated it into LlamaLanguageModel#doStream.
  • Added buffering logic to detect <think> / </think> and <tool_call> / </tool_call> even when split or embedded in larger chunks.
  • Added Bun-based regression tests for split/embedded markers and tool-call dedupe behavior; adjusted TypeScript and ESLint configs to accommodate test setup.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| packages/llama/tsconfig.json | Excludes src/**/__tests__/** from the package typecheck inputs. |
| packages/llama/src/streamParser.ts | Introduces the extracted streaming parser/state machine with buffering and marker detection. |
| packages/llama/src/ai-sdk.ts | Replaces inline streaming parsing logic with createLlamaStreamParser. |
| packages/llama/src/__tests__/streamParser.test.ts | Adds regression coverage for split/embedded markers and tool-call behavior. |
| eslint.config.mjs | Adds a test-only ESLint setting to treat bun:test as a core module for import linting. |

Comment thread packages/llama/src/streamParser.ts

@artus9033 artus9033 left a comment

LGTM after 1 comment resolved, great job!

@artus9033 artus9033 merged commit 8d7fb07 into callstackincubator:main Jun 9, 2026
3 of 4 checks passed
3 participants