feat(llm): stream reasoning on a dedicated out-of-band event #180
Open
lin-snow wants to merge 2 commits into
Reasoning models emit chain-of-thought inside <think>...</think>. In separated mode the LLM node strips it from the answer (#171) but then discarded it, leaving the chatflow "thinking" panel dark while streaming.

Rescue the stripped reasoning onto a dedicated event that parallels the text path, leaving answer untouched:

- ThinkStreamFilter.feed/finalize now return FilterChunk{text, reasoning} and surface the stripped reasoning instead of dropping it (residual truncated reasoning on an unclosed <think> is now handed out too).
- New two-layer StreamReasoningEvent -> NodeRunReasoningChunkEvent pair, mirroring StreamChunkEvent minus selector: it is out-of-band node telemetry (like NodeRunRetrieverResourceEvent), not a variable stream, so ResponseStreamFilter passes it through its case _ untouched.
- The LLM node emits reasoning alongside text in separated mode and sends exactly one terminal is_final marker per run that produced reasoning.

tagged mode emits no reasoning events and is unchanged; the terminal ModelInvokeCompletedEvent.reasoning_content is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
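For readers who want the filter contract in miniature, here is a rough sketch of a streaming `<think>` splitter that returns text and reasoning separately. It is not the code in this PR: the class name `ThinkFilterSketch`, the helper `_held_back`, and the default values on `FilterChunk` are assumptions made for illustration. Only the `feed`/`finalize` shape and the `text`/`reasoning` fields come from the commit message. The sketch handles tags split across chunks by holding back any buffer suffix that could still turn into a tag.

```python
from dataclasses import dataclass

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


@dataclass(frozen=True)
class FilterChunk:
    """Result of one filter step; field names follow the commit message."""

    text: str = ""
    reasoning: str = ""


def _held_back(buffer: str, tag: str) -> int:
    """Length of the longest buffer suffix that is a proper prefix of tag."""
    for size in range(min(len(buffer), len(tag) - 1), 0, -1):
        if buffer.endswith(tag[:size]):
            return size
    return 0


class ThinkFilterSketch:
    """Hypothetical stand-in for ThinkStreamFilter, for illustration only."""

    def __init__(self) -> None:
        self._buffer = ""
        self._in_think = False

    def feed(self, chunk: str) -> FilterChunk:
        self._buffer += chunk
        text: list[str] = []
        reasoning: list[str] = []
        while True:
            tag = CLOSE_TAG if self._in_think else OPEN_TAG
            out = reasoning if self._in_think else text
            index = self._buffer.find(tag)
            if index == -1:
                # No complete tag yet: emit everything except a possible
                # partial tag at the end of the buffer.
                cut = len(self._buffer) - _held_back(self._buffer, tag)
                out.append(self._buffer[:cut])
                self._buffer = self._buffer[cut:]
                break
            # Complete tag found: emit what precedes it and switch sides.
            out.append(self._buffer[:index])
            self._buffer = self._buffer[index + len(tag):]
            self._in_think = not self._in_think
        return FilterChunk("".join(text), "".join(reasoning))

    def finalize(self) -> FilterChunk:
        # Hand out whatever is still buffered instead of dropping it.
        rest, self._buffer = self._buffer, ""
        if self._in_think:
            self._in_think = False
            return FilterChunk(reasoning=rest)
        return FilterChunk(text=rest)
```

In this sketch reasoning is emitted eagerly, so the residual returned by `finalize` is only the held-back tail; the real filter may buffer differently.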
The dedicated reasoning event added in 4ea1e36 is emitted by the streaming invoke and translated by _dispatch, but _yield_run_completion only forwarded StreamChunkEvent | ModelPollingProgressEvent. The reasoning event fell through to the drop branch and never reached the graph layer, so the chatflow "thinking" panel stayed dark on the real _run() path despite the producer and dispatch sides being wired up. Add StreamReasoningEvent to the forwarded set and guard the seam with a _run()-level regression test that drives separated-mode <think> reasoning end to end (producer -> forwarder -> outputs).
Why
Reasoning models wrap chain-of-thought in `<think>...</think>`. In `separated` mode the LLM node strips it from the answer (#171) but then dropped it, so the chatflow "thinking" panel stays dark while streaming. This rescues that reasoning onto a dedicated out-of-band event parallel to the text path, leaving `answer` untouched.

Flow
flowchart TD
    M["model token stream<br/>(<think> may be split across chunks)"]
    M --> F["① ThinkStreamFilter.feed / finalize<br/>(reasoning.py)"]
    F -->|"FilterChunk(text, reasoning)"| S["② LLM node splits two paths<br/>_build_stream_text_events (llm/node.py)"]
    S -->|text| TXT["StreamChunkEvent<br/>selector = node/text"]
    TXT --> ANS["answer variable stream<br/>(existing logic, unchanged)"]
    S -->|reasoning| RSN["StreamReasoningEvent(chunk, is_final)<br/>no selector · out-of-band"]
    RSN --> LIFT["③ Node._dispatch lift: inject node_id<br/>(base/node.py)"]
    LIFT -->|"NodeRunReasoningChunkEvent"| EH["④ EventHandler.dispatch<br/>collect-only group, no warning spam<br/>(event_handlers.py)"]
    EH --> RSF["⑤ ResponseStreamFilter.on_event<br/>case _ passes through, never buffered into answer<br/>(response_stream.py)"]
    RSF --> DIFY["leaves graphon → dify step ②"]
    style RSN fill:#e8f0fe,stroke:#4285f4
    style LIFT fill:#e8f0fe,stroke:#4285f4
    style EH fill:#e8f0fe,stroke:#4285f4
    style RSF fill:#e8f0fe,stroke:#4285f4

The highlighted reasoning spine is the new path; the text path is its sibling and is unchanged. Carrying no `selector` is what lets reasoning ride `ResponseStreamFilter`'s `case _:` untouched: no answer-routing logic ever touches it.

What
- `ThinkStreamFilter.feed/finalize` now return `FilterChunk{text, reasoning}` instead of `str`, surfacing the stripped reasoning (incl. truncated residual from an unclosed `<think>`) instead of dropping it. `split_reasoning` / `extract_stream_reasoning` are untouched.
- New two-layer `StreamReasoningEvent` → `NodeRunReasoningChunkEvent` pair, mirroring the stream-chunk pair minus `selector`. Like `NodeRunRetrieverResourceEvent`, it's out-of-band telemetry (not a variable stream), so `ResponseStreamFilter` lets it ride `case _:` untouched.
- In `separated` mode the LLM node emits reasoning alongside text and sends exactly one terminal `is_final` marker per run that produced reasoning (truncated residual if any, else empty).

Compatibility
- `tagged` mode emits no reasoning events, byte-for-byte unchanged.
- `ModelInvokeCompletedEvent.reasoning_content` is unchanged.
- `NodeRunReasoningChunkEvent` is safe.

Tests
`uv run pytest`: 585 passed (added filter, node-stream, lift, and passthrough/no-warning coverage); `ruff check`, `ruff format --check`, `ty check` all clean.
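As a companion to the description, the terminal-marker rule and the node-level lift can be sketched together. This is a minimal illustration under assumptions, not the PR's code: `reasoning_events` and `lift` are hypothetical helper names, and the dataclass field layouts are guessed from the event names and the `(chunk, is_final)` signature shown in the flow. It shows one way to guarantee exactly one `is_final` marker per run that produced reasoning, carrying the truncated residual if there is one and an empty chunk otherwise.

```python
from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamReasoningEvent:
    chunk: str
    is_final: bool = False


@dataclass(frozen=True)
class NodeRunReasoningChunkEvent:
    node_id: str  # injected by the lift; the inner event has no node_id
    chunk: str
    is_final: bool = False


def reasoning_events(
    chunks: Iterable[str], residual: str = ""
) -> Iterator[StreamReasoningEvent]:
    """Yield reasoning chunks, then at most one terminal marker.

    `residual` stands for what the filter hands out on finalize.
    """
    produced = False
    for chunk in chunks:
        if chunk:  # skip steps that carried no reasoning
            produced = True
            yield StreamReasoningEvent(chunk)
    if produced or residual:
        # Exactly one terminal marker, only if the run produced reasoning.
        yield StreamReasoningEvent(residual, is_final=True)


def lift(node_id: str, event: StreamReasoningEvent) -> NodeRunReasoningChunkEvent:
    """Wrap the node-internal event into its graph-level counterpart."""
    return NodeRunReasoningChunkEvent(node_id, event.chunk, event.is_final)
```

A run with no reasoning yields no events at all in this sketch, which matches the claim that `tagged` mode stays unchanged.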