
feat(llm): stream reasoning on a dedicated out-of-band event #180

Open
lin-snow wants to merge 2 commits into main from feat/stream-reasoning-events

Conversation

@lin-snow lin-snow commented Jun 15, 2026

Contributor

Why

Reasoning models wrap chain-of-thought in <think>...</think>. In separated mode the LLM node strips it from the answer (#171) but then drops it, so the chatflow "thinking" panel stays dark while streaming. This PR moves the stripped reasoning onto a dedicated out-of-band event that runs parallel to the text path, leaving answer untouched.

Flow

flowchart TD
    M["model token stream<br/>(&lt;think&gt; may be split across chunks)"]
    M --> F["① ThinkStreamFilter.feed / finalize<br/>(reasoning.py)"]
    F -->|"FilterChunk(text, reasoning)"| S["② LLM node splits two paths<br/>_build_stream_text_events (llm/node.py)"]

    S -->|text| TXT["StreamChunkEvent<br/>selector = node/text"]
    TXT --> ANS["answer variable stream<br/>(existing logic, unchanged)"]

    S -->|reasoning| RSN["StreamReasoningEvent(chunk, is_final)<br/>no selector · out-of-band"]
    RSN --> LIFT["③ Node._dispatch lift: inject node_id<br/>(base/node.py)"]
    LIFT -->|"NodeRunReasoningChunkEvent"| EH["④ EventHandler.dispatch<br/>collect-only group, no warning spam<br/>(event_handlers.py)"]
    EH --> RSF["⑤ ResponseStreamFilter.on_event<br/>case _ passes through, never buffered into answer<br/>(response_stream.py)"]
    RSF --> DIFY["leaves graphon → dify step ②"]

    style RSN fill:#e8f0fe,stroke:#4285f4
    style LIFT fill:#e8f0fe,stroke:#4285f4
    style EH fill:#e8f0fe,stroke:#4285f4
    style RSF fill:#e8f0fe,stroke:#4285f4

The highlighted reasoning spine is the new path; the text path is its sibling and is unchanged. Carrying no selector is what lets reasoning ride ResponseStreamFilter's case _: untouched — no answer-routing logic ever touches it.

What

  • Filter: ThinkStreamFilter.feed/finalize now return FilterChunk{text, reasoning} instead of str, surfacing the stripped reasoning (including the truncated residual from an unclosed <think>) instead of dropping it. split_reasoning / extract_stream_reasoning are untouched.
  • Event: new two-layer StreamReasoningEvent → NodeRunReasoningChunkEvent pair, mirroring the stream-chunk pair minus selector. Like NodeRunRetrieverResourceEvent, it is out-of-band telemetry (not a variable stream), so ResponseStreamFilter lets it pass through case _: untouched.
  • Node: separated mode emits reasoning alongside text and sends exactly one terminal is_final marker per run that produced reasoning (carrying the truncated residual if any, else empty).
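
The filter change can be pictured with a small stand-alone sketch. This is an illustration under assumptions, not the implementation in reasoning.py: `SketchThinkFilter`, `_held_suffix_len`, and the buffering strategy are hypothetical. Only the `feed` / `finalize` method names and the `FilterChunk{text, reasoning}` shape are taken from the description. It shows how a tag split across chunks can be handled by holding back any suffix that might be the start of a tag.

```python
from dataclasses import dataclass

OPEN, CLOSE = "<think>", "</think>"


@dataclass
class FilterChunk:
    text: str = ""
    reasoning: str = ""


def _held_suffix_len(buf: str, tag: str) -> int:
    """Length of the longest suffix of buf that is a proper prefix of tag."""
    for n in range(min(len(buf), len(tag) - 1), 0, -1):
        if tag.startswith(buf[-n:]):
            return n
    return 0


class SketchThinkFilter:
    """Hypothetical streaming splitter for <think>...</think>."""

    def __init__(self) -> None:
        self._buf = ""
        self._in_think = False

    def _emit(self, out: FilterChunk, piece: str) -> None:
        if self._in_think:
            out.reasoning += piece
        else:
            out.text += piece

    def feed(self, chunk: str) -> FilterChunk:
        self._buf += chunk
        out = FilterChunk()
        while True:
            tag = CLOSE if self._in_think else OPEN
            idx = self._buf.find(tag)
            if idx == -1:
                break
            self._emit(out, self._buf[:idx])  # content before the tag
            self._buf = self._buf[idx + len(tag):]
            self._in_think = not self._in_think
        # No complete tag remains: hold back a possible partial tag.
        tag = CLOSE if self._in_think else OPEN
        cut = len(self._buf) - _held_suffix_len(self._buf, tag)
        self._emit(out, self._buf[:cut])
        self._buf = self._buf[cut:]
        return out

    def finalize(self) -> FilterChunk:
        # Flush whatever is still held into the current side.
        out = FilterChunk()
        self._emit(out, self._buf)
        self._buf = ""
        return out
```

In this sketch, feeding `"Hello <thi"`, `"nk>plan</th"`, `"ink>answer"` yields text `"Hello "`, reasoning `"plan"`, then text `"answer"`. If the stream ends inside an unclosed `<think>`, anything still buffered is returned by `finalize` as reasoning rather than being discarded.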

Compatibility

  • tagged mode emits no reasoning events — byte-for-byte unchanged.
  • Terminal ModelInvokeCompletedEvent.reasoning_content is unchanged.
  • Additive & independently mergeable: ignoring NodeRunReasoningChunkEvent is safe.
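
The emission contract (nothing in tagged mode, exactly one terminal marker per run that produced reasoning in separated mode) can be sketched as a generator. The function name `reasoning_events`, its parameters, and the use of plain strings for the mode are assumptions for illustration. The real node works off the filter output in llm/node.py.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class StreamReasoningEvent:
    chunk: str
    is_final: bool = False


def reasoning_events(
    pieces: Iterable[str], residual: str, mode: str
) -> Iterator[StreamReasoningEvent]:
    """Hypothetical helper: turn reasoning pieces into out-of-band events.

    pieces   -- reasoning fragments produced while streaming
    residual -- truncated reasoning left over at finalize (may be empty)
    mode     -- "separated" or "tagged" (string form is an assumption)
    """
    if mode != "separated":
        return  # tagged mode emits no reasoning events at all
    produced = False
    for piece in pieces:
        if piece:  # skip empty fragments
            produced = True
            yield StreamReasoningEvent(piece)
    if produced or residual:
        # Exactly one terminal marker, carrying the residual if any.
        yield StreamReasoningEvent(residual, is_final=True)
```

A run with no reasoning at all emits nothing, so consumers that ignore these events, or never see them, behave as before.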

Tests

uv run pytest — 585 passed (added filter, node-stream, lift, and passthrough/no-warning coverage); ruff check, ruff format --check, ty check all clean.

@lin-snow lin-snow self-assigned this Jun 15, 2026
@lin-snow lin-snow added the enhancement New feature or request label Jun 15, 2026
@lin-snow lin-snow force-pushed the feat/stream-reasoning-events branch from 0f1a551 to c4c03e4 Compare June 15, 2026 06:27
@lin-snow lin-snow marked this pull request as ready for review June 16, 2026 02:59
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 16, 2026
@lin-snow lin-snow force-pushed the feat/stream-reasoning-events branch 2 times, most recently from f64041d to aa8dfb4 Compare June 16, 2026 03:13
@lin-snow lin-snow requested review from QuantumGhost and WH-2099 June 16, 2026 03:25
@lin-snow lin-snow force-pushed the feat/stream-reasoning-events branch from aa8dfb4 to 3dda8c0 Compare June 16, 2026 03:30
Reasoning models emit chain-of-thought inside <think>...</think>. In
separated mode the LLM node strips it from the answer (#171) but then
discarded it, leaving the chatflow "thinking" panel dark while streaming.

Rescue the stripped reasoning onto a dedicated event that parallels the
text path, leaving answer untouched:

- ThinkStreamFilter.feed/finalize now return FilterChunk{text, reasoning}
  and surface the stripped reasoning instead of dropping it (residual
  truncated reasoning on an unclosed <think> is now handed out too).
- New two-layer StreamReasoningEvent -> NodeRunReasoningChunkEvent pair,
  mirroring StreamChunkEvent minus selector: it is out-of-band node
  telemetry (like NodeRunRetrieverResourceEvent), not a variable stream,
  so ResponseStreamFilter passes it through its case _ untouched.
- The LLM node emits reasoning alongside text in separated mode and sends
  exactly one terminal is_final marker per run that produced reasoning.

tagged mode emits no reasoning events and is unchanged; the terminal
ModelInvokeCompletedEvent.reasoning_content is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lin-snow lin-snow force-pushed the feat/stream-reasoning-events branch from 3dda8c0 to 4ea1e36 Compare June 16, 2026 04:05
@lin-snow lin-snow requested review from GareArc and wylswz and removed request for GareArc, QuantumGhost, WH-2099 and wylswz June 16, 2026 07:07
The dedicated reasoning event added in 4ea1e36 is emitted by the streaming
invoke and translated by _dispatch, but _yield_run_completion only forwarded
StreamChunkEvent | ModelPollingProgressEvent. The reasoning event fell through
to the drop branch and never reached the graph layer, so the chatflow
"thinking" panel stayed dark on the real _run() path despite the producer and
dispatch sides being wired up.

Add StreamReasoningEvent to the forwarded set and guard the seam with a
_run()-level regression test that drives separated-mode <think> reasoning end
to end (producer -> forwarder -> outputs).
@wylswz wylswz requested a review from laipz8200 June 16, 2026 08:52
@lin-snow lin-snow marked this pull request as draft June 16, 2026 13:02
@lin-snow lin-snow marked this pull request as ready for review June 17, 2026 08:52
@lin-snow lin-snow marked this pull request as draft June 17, 2026 08:52
@lin-snow lin-snow marked this pull request as ready for review June 17, 2026 09:12
