Context
FastLED AutoResearch on Windows with an ESP32-S3 on COM22 (VID:PID 303A:1001) can get into a state where fbuild deployment succeeds and the fbuild-backed serial monitor receives device output, but JSON-RPC writes do not produce any firmware response. The client then reports only a generic RPC timeout.
Observed with FastLED using fbuild 2.3.13:
bash autoresearch esp32s3 --all --skip-lint --timeout 240 --upload-port COM22
The deploy completed and the monitor received firmware output such as:
RESULT: {"type":"status","ready":true,"uptimeMs":...}
RESULT: {"chip":"ESP32-S3 (Xtensa)","type":"ready",...}
AutoResearch then sent JSON-RPC pings through fbuild.api.SerialMonitor.write():
{"method":"ping","params":[{}],"id":1}
No REMOTE: response was observed. The client retried, reset via fbuild, retried again, and still timed out.
A direct fbuild probe also did not produce an RPC response:
fbuild serial probe read COM22 --seconds 20 --send '{"method":"ping","params":[{}],"id":77}\n'
It printed only:
ESP-ROM:esp32s3-20210327
probe read: port=COM22 baud=115200 family=Esp32NativeUsbCdc dtr=false rts=false seconds=20
This should be surfaced as an actionable fbuild serial/line-control/write-path failure instead of an undifferentiated higher-level RPC timeout.
Code paths that look suspicious:
- crates/fbuild-daemon/src/handlers/websockets.rs opens WebSocket serial-monitor sessions with open_port(&port, baud_rate, &client_id, None). None falls back to (DTR=true, RTS=true) in SharedSerialManager::open_port, even though family_for_vid_pid(0x303A, *) maps ESP native USB CDC to (false, false).
- crates/fbuild-daemon/src/handlers/operations/monitor.rs and the post-deploy monitor attach in deploy.rs also pass None to open_port.
- crates/fbuild-python/src/serial_monitor.rs::write() sends a Write frame and waits for only the next WebSocket frame. If a serial Data frame arrives before WriteAck, it returns 0 and leaves the ack to be consumed later by unrelated reads.
- crates/fbuild-serial/src/manager.rs::write_to_port() calls serial.write(data) once and acks the returned byte count. It should either write the full buffer or report a partial write as a failed/incomplete write.
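The None fallback in the first item could be avoided by resolving DTR/RTS from the port's VID/PID whenever the caller does not name a family. A minimal sketch, assuming a simplified BoardFamily enum and a hypothetical resolve_line_control helper. Only the Esp32NativeUsbCdc name, family_for_vid_pid, and the (false, false) / (true, true) pairs come from the description above; the real fbuild types and signatures may differ.

```rust
/// Hypothetical stand-in for fbuild's board-family type. Only the
/// `Esp32NativeUsbCdc` variant name is taken from the probe output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BoardFamily {
    Esp32NativeUsbCdc,
    Other,
}

/// DTR/RTS levels to apply when the port is opened.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LineControl {
    dtr: bool,
    rts: bool,
}

/// Assumed mapping: any PID under Espressif's USB vendor ID (0x303A)
/// is treated as native USB CDC.
fn family_for_vid_pid(vid: u16, _pid: u16) -> Option<BoardFamily> {
    match vid {
        0x303A => Some(BoardFamily::Esp32NativeUsbCdc),
        _ => None,
    }
}

/// An explicit caller choice wins, then VID/PID inference, and only then
/// the legacy (DTR=true, RTS=true) default.
fn resolve_line_control(
    explicit: Option<BoardFamily>,
    vid_pid: Option<(u16, u16)>,
) -> LineControl {
    let family =
        explicit.or_else(|| vid_pid.and_then(|(vid, pid)| family_for_vid_pid(vid, pid)));
    match family {
        Some(BoardFamily::Esp32NativeUsbCdc) => LineControl { dtr: false, rts: false },
        Some(BoardFamily::Other) | None => LineControl { dtr: true, rts: true },
    }
}
```

With this shape, a WebSocket attach that passes no family for COM22 (303A:1001) would resolve to DTR=false, RTS=false, while an explicit family from the caller still takes precedence.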
Proposal
Make fbuild's daemon-backed serial monitor robust enough that AutoResearch and other clients can distinguish these cases:
- The monitor opened the port with the wrong board-family DTR/RTS state.
- The write was only partially accepted by the OS serial handle.
- The WebSocket write ack raced with serial data frames.
- The device is still in ROM/download/boot state after attach/reset and is not running the expected firmware.
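One way to separate the last case from the others is to look at what the monitor actually received after attach. The sketch below is a heuristic under stated assumptions: it treats lines starting with ESP-ROM: as ROM boot output and lines starting with RESULT: or REMOTE: as firmware output, based only on the samples quoted above. DeviceState and classify_output are hypothetical names, not fbuild APIs.

```rust
/// Hypothetical summary of what the most recent monitor output implies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceState {
    /// Nothing recognizable was received.
    Silent,
    /// The latest recognizable line was a ROM boot banner.
    RomBoot,
    /// The latest recognizable line came from the firmware.
    FirmwareRunning,
}

/// Walks the lines in arrival order so that a ROM banner seen after
/// firmware output (a reset) is reported as `RomBoot`.
fn classify_output(lines: &[&str]) -> DeviceState {
    let mut state = DeviceState::Silent;
    for line in lines {
        let line = line.trim_start();
        if line.starts_with("ESP-ROM:") {
            state = DeviceState::RomBoot;
        } else if line.starts_with("RESULT:") || line.starts_with("REMOTE:") {
            state = DeviceState::FirmwareRunning;
        }
    }
    state
}
```

A timeout diagnostic could include this state, so that a client seeing only the ROM banner is told the firmware never started rather than that the RPC went unanswered.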
Concrete implementation direction:
- Infer BoardFamily for daemon WebSocket and HTTP monitor opens from the selected OS port's VID/PID, matching fbuild serial probe read behavior. For deploy flows, prefer carrying the known board/platform family through to the post-deploy monitor attach rather than falling back to None.
- Log the inferred family plus DTR/RTS values on every daemon monitor attach.
- Change SerialMonitor.write() to keep reading until it sees the matching WriteAck or a timeout/error, while preserving/interleaving serial Data frames safely instead of treating the first non-ack frame as write failure.
- Change write_to_port() to use write_all() or an explicit full-buffer loop, and fail loudly on partial writes/timeouts.
- Add diagnostics in the write failure/timeout path that include requested byte count, ack byte count, port, inferred family, and current DTR/RTS policy.
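The full-buffer write and its diagnostics might look roughly like the following. This is a sketch, not the fbuild implementation: write_full, WriteDiag, and the ChunkedPort test double are hypothetical, and the only library items used are std::io::Write and std::io::ErrorKind. An explicit loop is used instead of write_all() so the number of bytes accepted before the failure is available for the diagnostic.

```rust
use std::io::{self, Write};

/// Hypothetical diagnostic returned when the whole buffer was not accepted.
#[derive(Debug, PartialEq, Eq)]
struct WriteDiag {
    port: String,
    requested: usize,
    written: usize,
    cause: io::ErrorKind,
}

/// Keeps calling `write` until every byte is accepted, and reports how far
/// it got if the handle stops accepting data or returns an error.
fn write_full<W: Write>(serial: &mut W, port: &str, data: &[u8]) -> Result<usize, WriteDiag> {
    let mut written = 0;
    let cause = loop {
        if written == data.len() {
            return Ok(written);
        }
        match serial.write(&data[written..]) {
            // A zero-length write means the handle accepts nothing more.
            Ok(0) => break io::ErrorKind::WriteZero,
            Ok(n) => written += n,
            // Interrupted writes are retried, as `write_all` does.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
            Err(e) => break e.kind(),
        }
    };
    Err(WriteDiag { port: port.to_string(), requested: data.len(), written, cause })
}

/// Test double: accepts at most `chunk` bytes per call and `capacity` in total.
struct ChunkedPort {
    chunk: usize,
    capacity: usize,
    received: Vec<u8>,
}

impl Write for ChunkedPort {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let room = self.capacity - self.received.len();
        let n = buf.len().min(self.chunk).min(room);
        self.received.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The inferred family and DTR/RTS policy could be added to WriteDiag as extra fields; they are omitted here to keep the sketch focused on byte accounting.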
Acceptance criteria
- WebSocket SerialMonitor attach for VID:PID 303A:* opens with BoardFamily::Esp32NativeUsbCdc semantics (DTR=false, RTS=false) unless the caller explicitly overrides it.
- HTTP monitor/post-deploy monitor paths no longer pass None when the board family can be inferred from the port or deploy context.
- SerialMonitor.write() succeeds when a Data frame arrives before WriteAck; add a regression test for Data -> WriteAck ordering.
- SerialMonitor.write() does not silently return success/zero on partial writes; callers can tell whether all bytes were accepted.
- AutoResearch-style RPC failures report a fbuild serial diagnostic when the write/attach path is suspect, rather than only timing out waiting for REMOTE:.
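The Data -> WriteAck regression case can be exercised without a real WebSocket by feeding frames from an iterator. The sketch below assumes a simplified frame model: Frame, WriteError, and await_write_ack are hypothetical names, and max_frames stands in for the timeout a real client would use.

```rust
use std::collections::VecDeque;

/// Hypothetical frame type. Only the `Data` and `WriteAck` names come from
/// the issue text; the fields are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Frame {
    Data(Vec<u8>),
    WriteAck { bytes: usize },
}

#[derive(Debug, PartialEq, Eq)]
enum WriteError {
    /// No ack arrived within the frame budget (stand-in for a timeout).
    AckTimeout,
    /// The ack reported a different byte count than was requested.
    Partial { requested: usize, acked: usize },
}

/// Reads frames until a `WriteAck` arrives. `Data` frames seen on the way
/// are parked in `pending`, in order, so later reads still deliver them.
fn await_write_ack<I>(
    frames: &mut I,
    pending: &mut VecDeque<Vec<u8>>,
    requested: usize,
    max_frames: usize,
) -> Result<usize, WriteError>
where
    I: Iterator<Item = Frame>,
{
    for _ in 0..max_frames {
        match frames.next() {
            Some(Frame::Data(bytes)) => pending.push_back(bytes),
            Some(Frame::WriteAck { bytes }) if bytes == requested => return Ok(bytes),
            Some(Frame::WriteAck { bytes }) => {
                return Err(WriteError::Partial { requested, acked: bytes })
            }
            None => break,
        }
    }
    Err(WriteError::AckTimeout)
}
```

A regression test would pass a Data frame followed by a WriteAck and assert both that the write succeeds and that the Data payload is still available to the next read.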
Open questions
- Should the Python SerialMonitor constructor accept an optional board/family argument for ambiguous VID/PID cases, or should the daemon infer from VID/PID plus the deploy context only?
- Should write acknowledgements carry a request id so the Python client can match acks deterministically even with interleaved serial data?
Related issues
- SerialMonitor.write() failure mode where writes targeted a closed port after post-deploy attach.