Skip to content

fix: lockstep mode requires one-way latency instead of round-trip#117

Merged
gschup merged 7 commits into
mainfrom
fix/lockstep-one-way-latency
Jun 22, 2026
Merged

fix: lockstep mode requires one-way latency instead of round-trip#117
gschup merged 7 commits into
mainfrom
fix/lockstep-one-way-latency

Conversation

@gschup

@gschup gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner

The lockstep gate previously required last_confirmed_frame == current_frame, forcing both sides to be phase-locked. With input_delay=D and one-way latency L, this caused a stall/burst cycle of length L on every frame regardless of D, effectively requiring D=2L (round-trip) to run smoothly.

The fix relaxes the gate to frames_ahead <= input_delay, allowing each side to pipeline up to D committed frames ahead of confirmation. Since those frames use delayed (not predicted) inputs, no rollback is ever needed. With D=L, remote inputs arrive just in time every frame with no stalls. D=0 collapses back to the old behaviour (frames_ahead must be 0).

Implementation:

  • InputQueue: add frame_delay() getter
  • SyncLayer: add min_local_input_delay(local_handles) to read the minimum delay across all local player queues
  • P2PSession: apply new gate; extract rollback+save block into handle_rollback_and_save(); deduplicate frames_ahead calculation

Tests:

  • test_lockstep_stalls_without_remote_input: gate stalls when remote input has not arrived (no cross-poll)
  • test_lockstep_advances_with_zero_latency_zero_delay: both sessions advance every frame when cross-polled before each advance

Fixes #116

@gschup

gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

@loganmc10 please take a look!

@loganmc10

Copy link
Copy Markdown

This seems to run a little better for a few seconds, but then I get this crash:

thread 'main' (1238769) panicked at /var/home/loganmc10/.cargo/git/checkouts/ggrs-5de1f5c5d7148c68/9d8104d/src/sync_layer.rs:340:9:
assertion failed: first_incorrect == NULL_FRAME || first_incorrect >= frame

@gschup

gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

hmm. the more I think about it, the more I think that the pre-existing logic was correct in the first place. I wonder if there is something else I am forgetting about...

@loganmc10

Copy link
Copy Markdown

I believe your original comment is on the right track (this caused a stall/burst cycle of length L on every frame). When I printed out the confirmed frame and current frame, it seemed like the confirmed frame would jump every so often, probably related to the input delay. I test this locally by using tc/netem to simulate latency, for example:

sudo tc qdisc add dev lo root netem delay 80ms 10ms

@gschup gschup force-pushed the fix/lockstep-one-way-latency branch from 9d8104d to 18348de Compare June 18, 2026 11:32
In lockstep mode (max_prediction = 0), the input pipeline now advances
unconditionally on every advance_frame call, decoupled from the game frame.
A new lockstep_game_frame counter tracks which game frame needs confirming;
it only increments when confirmed_frame >= lockstep_game_frame, at which
point confirmed_inputs() reads the verified inputs (never predicting).

Previously the gate evaluated against sync_layer.current_frame() and used
synchronized_inputs(), which entered prediction mode for missing remote
inputs.  When the real input arrived, first_incorrect_frame was set but
lockstep never runs rollback to clear it, causing an assertion failure in
set_last_confirmed_frame.

With the fix, both sides commit pipeline-ahead packets on their first call.
After a single round-trip each side sees confirmed_frame >= game_frame 0
and can advance, so one-way latency worth of input delay is sufficient
instead of round-trip latency (fixes #116).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gschup gschup force-pushed the fix/lockstep-one-way-latency branch from 18348de to cd06c1e Compare June 18, 2026 11:34
The advance-state block in advance_frame() now delegates to two private
methods, one per mode, keeping the top-level flow readable and the
per-mode logic self-contained.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gschup

gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

@loganmc10 another try with a wild new approach. Let me know if you have tried it out on your setup!

…rame sentinel

Instead of inferring a disconnected player from pi.frame == NULL_FRAME, read
the source of truth directly from local_connect_status.  A debug_assert
verifies the two stay in sync, catching any future divergence in confirmed_inputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@loganmc10

Copy link
Copy Markdown

Now I get this:

thread 'main' (1283913) panicked at /var/home/loganmc10/.cargo/git/checkouts/ggrs-5de1f5c5d7148c68/67fe0f6/src/input_queue.rs:221:9:
assertion failed: self.length <= INPUT_QUEUE_LENGTH
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The workaround seems to work:

 if p2p_session.current_frame() > p2p_session.confirmed_frame()
        && p2p_session.confirmed_frame() != ggrs::NULL_FRAME
{
    poll clients
}
advance_frame()

Could you not just have advance_frame() return an empty requests list if p2p_session.current_frame() > p2p_session.confirmed_frame()? Since advance_frame() calls poll_remote_clients, I could just have it continuously call advance_frame() until the Requests list is not empty.

Without a cap, a silent remote would fill the 128-slot input queue and panic.
The pipeline now stalls once current_frame is more than input_delay frames
ahead of lockstep_game_frame; input queuing is also skipped when capped so
the user re-submits next call, matching rollback behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gschup

gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

I tried to fix the fix. its a bit messy, but cleanup will follow after it starts working for you :D

@loganmc10

Copy link
Copy Markdown

It is better, but it still seems to add 1-2 extra frames of latency? With my current workaround, on a simulated 80ms link, I can get roughly 60FPS with 5 frames of input delay (which makes sense, since 80/16.66 is almost 5). But with this PR, I have to set the input delay to 7 to get 50 a full 60 FPS.

@gschup

gschup commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

Hmm, this is quite a tricky one to figure out. I will keep going, but I might need some time to track down the proper fix. I will ping you once I have a new try.

I hope you achieve loading and saving before I get there, though :) rollback to the future!

@loganmc10

Copy link
Copy Markdown

yes, saving/loading does work, but the problem is the graphics. The N64 used a shared memory architecture (the CPU and GPU can write to the same memory), and the graphics are emulated using Vulkan.

I don't know if I can do a good job explaining the problem, but basically, when you save (or load) the state, you need to tell the Vulkan side "finish what you are doing" to make sure that the memory is correct for the state. But this kills the performance, especially when you are possibly doing one load and many saves per frame. The emulator already didn't have a ton of wiggle room to be emulating/simulating multiple frames per step.

Some games rely on the data that the GPU writes to the memory for the game itself (if you are familiar with Pokémon Snap for example, the game actually "looks" at the rendered picture to decide if you got a good picture), so you can't just skip the graphics when making the state.

Anyway, one day I'm sure I will figure something out, but for now, just having it spin before calling advance_frame is enough to avoid triggering a rollback.

gschup and others added 3 commits June 19, 2026 14:13
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…haviour

- builder: rewrite lockstep section to explain input delay = one-way
  latency, drop stale round-trip framing; both modes share the same
  input-delay semantics (press F → effect at F+D)
- p2p_session: document that advance_frame calls poll_remote_clients
  internally, and that an empty return in lockstep is a stall not an error
- docs/sessions.md: update builder option table rows accordingly
- docs/main-loop.md: add inline note for the lockstep stall case
- test: fix test_lockstep_stalls_without_remote_input — only assert sess1
  stalls (reliable across platforms); drop sess2 assertion which was
  non-deterministic due to OS loopback timing differences

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gschup

gschup commented Jun 22, 2026

Copy link
Copy Markdown
Owner Author

I think this is an improvement to the codebase in any case. Will merge, but leave the issue open with a few comments.

@gschup gschup merged commit 76d48ee into main Jun 22, 2026
2 checks passed
@gschup gschup deleted the fix/lockstep-one-way-latency branch June 22, 2026 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lockstep mode requires more input delay than it should

2 participants