Skip to content

Round-2 review follow-ups: GeoIP end-to-end routing, quiche RESET/EOF semantics, cleanups#54

Merged
Itsusinn merged 23 commits into
mainfrom
claude/serene-robinson-280bef
Jul 2, 2026
Merged

Round-2 review follow-ups: GeoIP end-to-end routing, quiche RESET/EOF semantics, cleanups#54
Itsusinn merged 23 commits into
mainfrom
claude/serene-robinson-280bef

Conversation

@Itsusinn

@Itsusinn Itsusinn commented Jul 2, 2026

Copy link
Copy Markdown
Member

Overview

Follow-up to #53 (merged): the remaining fixes from the round-2 code review, plus two larger items that #53 deferred. 12 commits, each self-contained.

GeoIP/GeoSite routing works end-to-end now

GEOIP / GEOSITE rules were a silent no-op in the running server: tuic-server routed via wind-core's AclRouter (builds its MatchContext without geo lookups), and wind-acl's now-geodata-capable AclEngine had no production user. Geo rules compiled but never matched — their traffic fell through to the default outbound (fail-open).

  • wind-acl: AclEngineBuilder::rules() to feed pre-lowered wind_core rules directly.
  • tuic-server: new [geodata] config (geosite/geoip .dat paths), compiled into a cache under data_dir at startup; TuicRouter switched from AclRouter to AclEngine, wired with the database. Guards (drop_loopback/drop_private) stay in TuicRouter; AclEngine preserves first-match-wins, so non-geo routing is unchanged.
  • Test: a GEOIP,CN,reject rule rejects a CN IP through the same wiring TuicRouter uses (and is a no-op without geodata).

quiche stream RESET/EOF semantics (M10/M11)

  • M10 — a peer RESET_STREAM was swallowed as a clean EOF, so a truncated/aborted stream looked complete. The inbound channel now carries Result<Bytes, u64>; QuicheRecv surfaces a reset as an io::ErrorKind::ConnectionReset, matching quinn.
  • M11 — a stream_send error (peer STOP_SENDING, StreamLimit, …) was only logged while poll_write kept returning Ok, silently dropping the app's writes. The stream is now marked failed and its back-channel closed, so the next poll_write fails with BrokenPipe.
  • Test: a backend-generic reset-visibility loopback test (explicit reset must surface as an error on quinn and quiche); verified it fails against the pre-M10 behavior.
  • M12 was evaluated and NOT applied. The review claimed quinn resets on drop-without-finish and quiche should match. An empirical test shows this codebase's quinn SendStream delivers a clean FIN on drop, so both backends already agree — making quiche reset on drop would have diverged from quinn, not aligned. Documented in the commit.

Low / nit cleanups

  • wind: log filter now covers wind_acl/wind_geodata/wind_quic; removed dead panicking target_addr_to_socket_addr.
  • tuic-server: removed unused compat::QuicClient.
  • wind-naive: derive SNI correctly from a bracketed IPv6 server address (was [2001 from split(':')).
  • tuic-client: parse IPv6 server addresses correctly and reject ambiguous unbracketed ones; return an error (not unwrap() panic) on double set_config; update log filter targets to the wind-tuic split.
  • wind-geodata: write the cache atomically (temp + rename).
  • wind-acme: regenerate an expired self-signed cert instead of serving it forever; bind the HTTP-01 challenge server dual-stack ([::]:80, IPv4 fallback).

Testing

Full workspace compiles (cargo check --workspace --all-targets). All touched crates' test suites pass, including new/verified tests:

  • wind-quic reset-visibility loopback (quinn + quiche), verified it catches the pre-fix behavior.
  • tuic-server geodata_routing (GEOIP matches with data, no-op without).
  • wind-naive / tuic-client IPv6 host-parsing unit tests.
  • wind-geodata out-of-bounds and round-trip tests; wind-acme renewal tests.

Reviewer notes

  • AclEngine runs the order-preserving optimizer; wind-acl's existing regression tests plus the new tuic-server GEOIP test cover semantic parity with the previous AclRouter.
  • Skipped intentionally: the tcp:// DNS scheme still opens UDP first (would need verifying hickory-internal TCP-only construction; low value vs. risk). Remaining items are cataloged in CODE_REVIEW_ROUND2.md.

🤖 Generated with Claude Code

Itsusinn added 23 commits July 2, 2026 11:05
GEOIP/GEOSITE rules were a silent no-op in the running server: tuic-server
routed via wind-core's AclRouter (which builds its MatchContext without geo
lookups), and wind-acl's now-geodata-capable AclEngine had no production
user. Geo rules compiled but never matched -- their traffic fell through to
the default outbound (fail-open).

- wind-acl: add AclEngineBuilder::rules() to accept pre-lowered wind_core
  rules directly (hosts that convert their own config), evaluated between
  apernet and clash rules.
- tuic-server: add a [geodata] config section (geosite/geoip .dat paths),
  load and compile it into a cache under data_dir at startup, and switch
  TuicRouter from wind-core's AclRouter to wind-acl's AclEngine, wired with
  the database. The existing drop_loopback/drop_private guards stay in
  TuicRouter (AclEngine's own guards are left off). AclEngine preserves
  first-match-wins semantics, so routing is unchanged for non-geo rules.

Pulling wind-geodata in transitively adds rkyv, whose blanket PartialEq
impl for SocketAddr made bare  ambiguous in the
config tests; annotate those with the concrete type.

Adds tests/geodata_routing.rs: a GEOIP,CN,reject rule rejects a CN IP via
the same AclEngine wiring TuicRouter uses (and is a no-op without geodata).

Assisted-by: Claude:claude-fable-5
…send failures

Two correctness gaps in the quiche driver's stream handling:

M10 - a peer RESET_STREAM was swallowed as a clean EOF: read_stream set
in_fin on any stream_recv error, so QuicheRecv reported EOF and a
truncated/aborted stream looked complete. The inbound channel now carries
Result<Bytes, u64>; read_stream records the reset code, flush_inbound
delivers the buffered bytes then an Err(code), and QuicheRecv surfaces it
as an io ConnectionReset error -- matching quinn's RecvStream.

M11 - a stream_send error (peer STOP_SENDING, StreamLimit, etc.) was only
logged: out_queue was cleared but the back-channel stayed open, so
QuicheSend::poll_write kept returning Ok and the app's writes were
silently dropped. The stream is now marked send_failed and its
back-channel closed (dropping the parked receiver, or dropping it when the
next write arrives), so the writer's next poll_write fails with BrokenPipe.

Adds a backend-generic reset-visibility loopback test (explicit reset must
surface as an error on quinn and quiche); verified it fails against the
pre-M10 behavior.

M12 (drop-without-finish emitting FIN vs RESET) from the review was
evaluated and NOT applied: an empirical test shows this codebase's quinn
SendStream delivers a clean FIN on drop, so both backends already agree
(clean close on drop). Making quiche reset on drop would have diverged
from quinn rather than aligned with it.

Assisted-by: Claude:claude-fable-5
The tracing Targets list omitted these crates, so their trace/debug
events fell through to the default INFO filter and --log-level trace was
silently ineffective for them (the same class of gap the list already
documents for the others).

Assisted-by: Claude:claude-fable-5
The whole util module was a single #[allow(dead_code)] function that had
no callers and would panic (blocking to_socket_addrs + expect) on domain
resolution failure if it ever were called. Remove it and its module
declarations.

Assisted-by: Claude:claude-fable-5
The compat module's QuicClient wrapper had no constructor or reference
anywhere in the crate. Remove the module and its declaration.

Assisted-by: Claude:claude-fable-5
…dress

The default server_name used server_address.split(':').next(), which
returns "[2001" for a [2001:db8::1]:443 authority -- a bogus SNI. Parse
the host properly: unwrap bracketed IPv6, otherwise strip the trailing
:port. Adds a unit test.

Assisted-by: Claude:claude-fable-5
…cketed

deserialize_server used rsplit_once(':'), so an unbracketed IPv6 literal
like "2001:db8::1" (user forgot the port) mis-split into host
"2001:db8:" + port "1" and was silently accepted, surfacing later as an
opaque connect failure. Parse bracketed "[addr]:port" explicitly and
reject ambiguous unbracketed IPv6 with a clear error. Adds a unit test.

Assisted-by: Claude:claude-fable-5
… of panicking

set_config unwrapped the OnceCell::set result, so a second call (SERVER
already initialized) panicked the caller. Return Error::Socks5 instead.

Assisted-by: Claude:claude-fable-5
The filter listed pre-split targets (tuic, tuic_quinn), so relay debug
under wind_tuic and the tuic_out/udp targets fell through to the default
INFO filter -- log_level = "debug" showed no tunnel debug. List the
crates and custom targets the client actually logs under.

Assisted-by: Claude:claude-fable-5
build_and_open wrote the cache with a plain fs::write, so a process
killed mid-write (or two concurrent builders) could leave a truncated
file that a later open() would read. Write to a pid-scoped temp file and
rename it into place (atomic on the same filesystem).

Assisted-by: Claude:claude-fable-5
ensure_self_signed_cert_files returned early whenever both files existed,
never checking validity — so the ~45-day self-signed cert was served
forever once generated. Reuse the existing pair only if the cert is still
fresh (reusing the http01 renewal helpers), otherwise regenerate.

Assisted-by: Claude:claude-fable-5
The challenge server bound IPv4-only 0.0.0.0:80 while the resolver flow
binds [::]:80, so ACME validation over IPv6 (Let's Encrypt checks AAAA
records when present) could fail. Bind [::]:80 (dual-stack on Linux),
falling back to 0.0.0.0:80 on hosts without an IPv6 stack.

Assisted-by: Claude:claude-fable-5
Re-wrap doc comments per the project's nightly rustfmt config (CI runs
cargo +nightly fmt --all -- --check). No code changes.

Assisted-by: Claude:claude-fable-5
"mis-split" tripped crate-ci/typos (reads "mis" as a typo). Reword; no
code change.

Assisted-by: Claude:claude-fable-5
…wn binaries

test_server_client_integration, test_ipv6_server_client_integration and
test_client_proxy_configuration each call tuic_client::run(), which sets
process-global connection/SOCKS state via OnceCell. Run together in one
test binary (as CI does), only the first to run could initialize the
client; the others failed with "Failed to set global wind-tuic
connection" and their SOCKS5 connect was refused. Before this repo's
tests grew real assertions those failures were invisible; now they fail.

Each `tests/*.rs` file is a separate test binary (separate process), so
move the IPv6 and proxy-config tests into their own files. The remaining
integration_tests.rs binary then runs at most one client test. No
production code changes; verified all three binaries pass.

Assisted-by: Claude:claude-fable-5
UdpSession::new applied the dual-stack option unconditionally, but
IPV6_V6ONLY only exists on IPv6 sockets. On an IPv4 SOCKS5 listener the
set_only_v6 call failed with ENOPROTOOPT ("Protocol not available",
Linux errno 92), so every UDP ASSOCIATE errored out. Windows tolerated
it, hiding the bug locally; Linux CI caught it once the client-using
integration tests were isolated enough to actually exercise the path.

Guard the option on `local_ip.is_ipv6()`, mirroring `Server::new`.

Assisted-by: Claude:claude-fable-5
QuicheSend::reset() enqueues a StreamShutdown command and then (on drop)
closes the out back-channel. The channel-close sets out_done, which made
write_stream emit a clean FIN. Because the command and the channel-close
travel on different channels, the FIN could be sent before the RESET was
applied, so the peer saw a clean EOF instead of a reset — a real bug on
the h3 (masquerade) reset path, and the cause of a flaky
quiche_reset_visibility failure on CI (Windows) despite passing locally.

Process commands before out-channel events in wait_for_data (`biased`),
and when applying a write-reset mark the send side done/failed so
write_stream never emits the racing FIN. Stress-tested 15x plus the full
loopback suite; both backends stable.

Assisted-by: Claude:claude-fable-5
The pinned quinn fork's quinn-udp fails to compile against musl's libc
(a u32/usize field-width mismatch in linux.rs), so the x86_64/aarch64
`*-unknown-linux-musl` build targets were red. Build those against glibc
instead and statically link the CRT (-C target-feature=+crt-static) to
keep self-contained binaries for the Docker images and release archives.

Renames the artifacts x86_64-linux-musl/aarch64-linux-musl ->
x86_64-linux-static/aarch64-linux-static and updates the ci.yml Docker
inputs and both Dockerfile ARG defaults to match.

NOTE: a statically-linked glibc cannot load NSS modules, so getaddrinfo
(the System DNS resolver) does not work at runtime in these builds;
configure a wind-dns network resolver (e.g. [dns] mode = "cloudflare-tls")
which resolves over the wire and bypasses NSS. Documented in target.toml.

Assisted-by: Claude:claude-fable-5
Pushing ~4 MiB through the quiche loopback drives quiche 0.29's
congestion controller into its PRR recovery path, which panics on 32-bit
targets (congestion/prr.rs overflow) — an upstream quiche limitation
unrelated to the driver buffering this test exercises. It surfaced as an
i686-windows CI failure. Gate the quiche bulk-transfer test to 64-bit;
the quinn variant still covers 32-bit, and the shared run_bulk helper
stays in use.

Assisted-by: Claude:claude-fable-5
The x86_64-linux (and other cross) release builds were failing with
"No space left on device" while linking rlibs — the ~14 GB runner disk
was exhausted by the target/ tree plus an 850M sccache dir for the
quiche/boringssl-heavy build. The sccache size was originally derived
only from GitHub's 10 GB cache cap (10 GB / 12 targets ≈ 850M), ignoring
local build disk. Drop it to 400M (4.8 GB total, still well under the
cap) to leave build headroom.

Assisted-by: Claude:claude-fable-5
This reverts commit 523792e, restoring the x86_64/aarch64 musl build
targets and the `-linux-musl` release/Docker artifact names. The musl
compile failure is instead fixed by rolling the quinn fork back to a rev
that builds on musl (next commit), which avoids the static-glibc NSS/DNS
caveat. The sccache disk fix (091a161) is kept.

Assisted-by: Claude:claude-fable-5
The bbrv3 branch tip (b87bf72a) was rebased onto a newer upstream quinn
whose quinn-udp receive-timestamps code (RECVERR cmsg handling) fails to
compile on musl targets (a u32/usize mismatch on cmsg_len), breaking the
x86_64/aarch64-linux-musl builds. Pin the lockfile precisely to ce60e5b5
(the tip before that rebase), keeping the `branch = "bbrv3"` spec so it
unifies with quinn-congestions (which also tracks bbrv3 and provides
BbrConfig) — a `rev` spec would split them into two mismatched
quinn-proto copies.

ce60e5b5's `Connecting::into_0rtt` yields `(Connection, ZeroRttAccepted)`
rather than the newer single-value form, so adapt the one match arm in
quinn/inbound.rs (the accepted-future is unused, as before). Verified the
quinn loopback, 0-RTT, and tuic e2e tests pass.

Assisted-by: Claude:claude-fable-5
Broaden the gate from 64-bit to x86_64. Off x86_64, quiche 0.29's
congestion controller is unreliable under the ~4 MiB load: it panics in
PRR recovery on 32-bit and paces slowly enough on aarch64 CI runners to
time out. Both are upstream quiche limitations, not bugs in the
architecture-independent driver buffering this test covers on x86_64.
quinn_bulk_transfer still exercises the path on every arch.

Assisted-by: Claude:claude-fable-5
@codspeed-hq

codspeed-hq Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Merging this PR will not alter performance

✅ 22 untouched benchmarks
⏩ 11 skipped benchmarks1


Comparing claude/serene-robinson-280bef (2ded734) with main (96b6947)

Open in CodSpeed

Footnotes

  1. 11 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@Itsusinn Itsusinn merged commit 50c0a2b into main Jul 2, 2026
17 checks passed
@Itsusinn Itsusinn deleted the claude/serene-robinson-280bef branch July 2, 2026 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant