A way-better ggwave: a high-performance, adaptive acoustic modem (data over sound) in Rust.
sonance sends data through the air (or a cable) as sound. Where ggwave is a multi-FSK toy (~64-128 bps, ~183-byte cap, fixed Reed-Solomon), sonance implements the DSP that research uses and ggwave skips:
- Adaptive OFDM with per-subcarrier bit-loading (BPSK..1024-QAM, water-filled from measured SNR).
- Coherent channel estimation + per-subcarrier equalization — the technique that lifts the rate over real rooms.
- Soft-decision LDPC (via labrador-ldpc) with frequency/time interleaving for burst-noise resilience.
- Fountain-coded transport (RaptorQ, RFC 6330) for reliable streaming over a lossy link.
- Chirp matched-filter synchronization + Schmidl-Cox CFO + pilot-based CFO/SFO tracking.
- One embeddable codebase: native (cpal), WASM (Web Audio), and a no_std core for MCUs + a C ABI.
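
To make the per-subcarrier bit-loading idea concrete, here is a minimal sketch. It is not sonance's implementation: it swaps in the simpler gap-approximation rule, b = floor(log2(1 + SNR/gap)), instead of true water-filling (which also redistributes transmit power). The function names, the 6 dB gap, and the cap at 10 bits (1024-QAM) are illustrative assumptions.

```rust
/// Bits to load on one subcarrier from its measured SNR, using the gap
/// approximation b = floor(log2(1 + snr / gap)). A result of 0 means the
/// subcarrier is too noisy to carry data and is left unused.
/// Hypothetical helper, not part of the sonance API.
fn bits_for_snr(snr_db: f64, gap_db: f64, max_bits: u32) -> u32 {
    let snr = 10f64.powf(snr_db / 10.0); // dB -> linear power ratio
    let gap = 10f64.powf(gap_db / 10.0); // SNR margin for coding/implementation loss
    let bits = (1.0 + snr / gap).log2().floor();
    (bits as u32).min(max_bits)
}

/// Apply the rule independently to every subcarrier.
fn load_bits(snr_db: &[f64], gap_db: f64, max_bits: u32) -> Vec<u32> {
    snr_db
        .iter()
        .map(|&s| bits_for_snr(s, gap_db, max_bits))
        .collect()
}
```

With per-subcarrier SNRs of 3, 12, 25, 40 and -5 dB, a 6 dB gap and a 10-bit cap, `load_bits` returns `[0, 2, 6, 10, 0]`: the two weakest subcarriers are switched off, and the strongest saturates at 1024-QAM.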
Robust over-the-air, device-to-device, in real rooms. Reliability and sync robustness beat the peak kbps number. Measured delivered goodput is ~2-4 kbps robust and up to ~16 kbps on a clean/close-range link with the Auto rate-ladder (see Performance) — already 50-300x ggwave. The big numbers (30-64 kbps) are cable-only; near-ultrasonic is low-rate/short-range by hardware physics.
A sans-IO codec core (Encoder/Decoder: bytes ⇄ PCM, deterministic, no audio/threads/clock) with audio behind AudioSink/AudioSource traits. This is what makes the DSP genuinely test-driven: ~90% of tests and all BER gates run by piping Encoder output through a deterministic, seeded channel simulator (sonance-sim) into Decoder — no hardware required. See docs/architecture.md for the full layered-stack diagram and rationale.
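
The sketch below shows why a seeded channel makes BER tests reproducible: the same seed always produces the same noise, so a failing case can be replayed exactly. It is a standalone illustration, not the sonance-sim API. The xorshift64* generator, the `awgn` function, and its signature are assumptions made for this example.

```rust
/// Tiny deterministic PRNG (xorshift64*). Illustrative only.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        Self(seed.max(1)) // the xorshift state must be non-zero
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Uniform sample in (0, 1], safe to pass to ln().
    fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_gaussian(&mut self) -> f64 {
        let u1 = self.next_unit();
        let u2 = self.next_unit();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Add white Gaussian noise at `snr_db` relative to the signal's mean power.
/// The same (pcm, snr_db, seed) triple always yields the same output.
fn awgn(pcm: &[f32], snr_db: f64, seed: u64) -> Vec<f32> {
    let power =
        pcm.iter().map(|&s| f64::from(s).powi(2)).sum::<f64>() / pcm.len().max(1) as f64;
    let sigma = (power / 10f64.powf(snr_db / 10.0)).sqrt();
    let mut rng = XorShift64::new(seed);
    pcm.iter()
        .map(|&s| s + (sigma * rng.next_gaussian()) as f32)
        .collect()
}
```

A test would pass the encoder's PCM through `awgn(&pcm, 20.0, seed)` before pushing it into the decoder. Because the seed is fixed, a BER gate gives the same verdict on every run and every machine.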
| Crate | Role |
|---|---|
| sonance-dsp | no_std PHY: modulators, FFT front-end, sync, equalizer |
| sonance-link | framing, CRC, LDPC FEC, interleaving, rate control |
| sonance-transport | RaptorQ fountain coding, chunking, BLAKE3 integrity |
| sonance-crypt | optional E2E crypto: X25519 key agreement, XChaCha20-Poly1305 AEAD, BLAKE3 KDF, 6-digit SAS |
| sonance-audio | cpal/Web Audio I/O, resampling, lock-free ring buffer, raw-capture contract |
| sonance | top-level ergonomic API |
| sonance-ffi | C ABI (cbindgen + sonance.h) |
| sonance-wasm | wasm-bindgen bindings + browser demo |
| sonance-cli | send/recv/probe/fieldtest demo binary |
| sonance-sim | deterministic channel simulator (AWGN, reverb IRs, distortion, clipping) |
| sonance-testutil | shared test fixtures + helpers used across the BER/round-trip suites |
| xtask | BER/SNR sweeps, goodput benchmarks, report + header generation |
The sans-IO core round-trips bytes through PCM with no audio hardware — encode to samples, push them into a decoder, read the message back:
use sonance::codec::{Decoder, Encoder};
use sonance::Profile;
// Encode bytes to PCM samples.
let encoder = Encoder::new(Profile::Balanced);
let pcm = encoder.encode(b"hello over sound");
// Decode: push the PCM (here in one go) and pop the message.
let mut decoder = Decoder::new(Profile::Balanced);
decoder.push(&pcm);
decoder.flush(); // finite input: force a final decode attempt
assert_eq!(decoder.next_message().as_deref(), Some(&b"hello over sound"[..]));

In practice the pcm would travel out a speaker and back in via a mic (or through sonance-sim in tests). The Sender/Receiver wrappers in sonance::io weld this to real audio devices behind the AudioSink/AudioSource traits.
Built test-first, milestone by milestone — all milestones are implemented:
- Spike 0 / M0: de-risked sync (chirp detects + times a frame at 0-sample error down to -5 dB SNR over a reverb IR, and a real laptop speaker/mic round-trip at matched-filter peak 0.86), workspace skeleton, deterministic channel simulator, criterion baselines, CI.
- M1: Balanced OFDM PHY (sans-IO Encoder/Decoder), chirp + Schmidl-Cox sync, CRC framing.
- M2: pilot CFO/SFO tracking, equalizer, interleaver + soft LDPC, AGC/AFE, cpal I/O, CLI.
- M3: RaptorQ transport + streaming, near-ultrasonic profile, WASM browser demo.
- M4: Auto adaptive bit-loading, CSS Robust + Cable profiles, C ABI, no_std, bindings, docs.
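
The chirp synchronization de-risked in Spike 0 rests on matched filtering: slide a copy of the known chirp over the incoming samples and take the offset where the normalized correlation peaks. The following is a minimal sketch of that idea, not the project's code. It uses a direct O(N·M) time-domain correlation for clarity, where a real receiver would more likely use an FFT-based correlation, and all function names and parameters are illustrative assumptions.

```rust
use std::f32::consts::PI;

/// Linear chirp sweeping from `f0` to `f1` Hz over `len` samples at rate `fs`.
fn linear_chirp(f0: f32, f1: f32, len: usize, fs: f32) -> Vec<f32> {
    let duration = len as f32 / fs;
    let sweep = (f1 - f0) / duration; // sweep rate in Hz per second
    (0..len)
        .map(|n| {
            let t = n as f32 / fs;
            (2.0 * PI * (f0 * t + 0.5 * sweep * t * t)).sin()
        })
        .collect()
}

/// Slide `template` over `signal` and return (best offset, normalized peak).
/// The peak is a correlation coefficient: 1.0 means a perfect scaled match.
fn matched_filter_peak(signal: &[f32], template: &[f32]) -> Option<(usize, f32)> {
    if template.is_empty() || signal.len() < template.len() {
        return None;
    }
    let template_energy: f32 = template.iter().map(|x| x * x).sum();
    let mut best = (0usize, 0.0f32);
    for offset in 0..=signal.len() - template.len() {
        let window = &signal[offset..offset + template.len()];
        let dot: f32 = window.iter().zip(template).map(|(a, b)| a * b).sum();
        let window_energy: f32 = window.iter().map(|x| x * x).sum();
        let denom = (template_energy * window_energy).sqrt();
        let score = if denom > 0.0 { dot / denom } else { 0.0 };
        if score > best.1 {
            best = (offset, score);
        }
    }
    Some(best)
}
```

Embedding a 480-sample chirp at sample 37 of an otherwise silent buffer and calling `matched_filter_peak` recovers offset 37 with a peak near 1.0. Noise and reverb pull the peak below 1.0, which is why a detection threshold is needed in practice.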
On top of the milestones, a speed + cleanup pass added superframe batching (many transport packets behind one preamble), a Fast PHY profile (shorter CP + chirp), and a wired end-to-end Auto rate-ladder path.
Delivered goodput (info bits/sec, including inter-frame gaps) measured by cargo run -p xtask --release -- goodput on a clean link (AWGN 32 dB, 180-byte MTU). "b/s" is real payload throughput, not channel symbol rate.
| Payload | Balanced, 1 pkt/frame (b/s) | Balanced superframe x8 (b/s) | Fast superframe x8 (b/s) | Auto superframe x8 (b/s) |
|---|---|---|---|---|
| 256 B | 1,879 | 2,108 | 2,365 | 7,358 |
| 1 KB | 2,505 | 3,638 | 4,059 | 15,031 |
| 4 KB | 2,505 | 3,518 | 3,921 | 15,493 |
| 16 KB | 2,559 | 3,600 | 4,013 | 15,792 |
Two levers do the work:
- Superframe batching packs N transport packets behind a single ~93 ms preamble (40 ms chirp + Schmidl-Cox training symbol) instead of paying it per packet — a 16 KB transfer drops from 94 preambles to 12.
- The Auto rate-ladder measures channel SNR and climbs from the conservative QPSK r1/2 default up toward 1024-QAM r4/5 (10 coded bits per subcarrier, 8 information bits after the rate-4/5 code) when the link allows. This is the ~4x multiplier on top of superframe batching.
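
The arithmetic behind both levers can be checked with a few lines. This is a worked example rather than project code: the helper names are hypothetical, and the packet count of 94 is taken from the 16 KB figure above.

```rust
/// Preambles needed to send `packets` transport packets when each
/// superframe carries up to `per_frame` packets behind one preamble.
fn preambles(packets: u32, per_frame: u32) -> u32 {
    (packets + per_frame - 1) / per_frame // ceiling division
}

/// Information bits per data subcarrier per OFDM symbol:
/// log2(constellation size) coded bits, scaled by the code rate num/den.
/// Assumes `constellation` is a power of two (4 = QPSK, 1024 = 1024-QAM).
fn info_bits(constellation: u32, rate_num: u32, rate_den: u32) -> f64 {
    let coded_bits = constellation.trailing_zeros(); // log2 for powers of two
    f64::from(coded_bits) * f64::from(rate_num) / f64::from(rate_den)
}
```

`preambles(94, 1)` is 94 and `preambles(94, 8)` is 12, so at ~93 ms each the preamble airtime for a 16 KB transfer falls from roughly 8.7 s to roughly 1.1 s. `info_bits(4, 1, 2)` is 1.0 and `info_bits(1024, 4, 5)` is 8.0, so the top rung carries up to 8x the information per subcarrier of the default. The measured gain in the table is closer to 4.4x (15,792 vs 3,600 b/s at 16 KB), since per-frame overhead does not shrink with the constellation.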
These are clean-link numbers. Real over-the-air between two laptops lands lower and SNR-dependent — Auto backs off automatically — and the reverb + 25 ppm clock-drift reliability floor is validated separately by xtask sweep-room.
cargo test --workspace # unit + integration + BER gates
cargo bench --workspace # criterion micro/macro benches
cargo build -p sonance-dsp --no-default-features --target wasm32-unknown-unknown # no_std core

Fuzz targets (nightly + libfuzzer, excluded from the default workspace):
cargo +nightly fuzz run frame_decode
cargo +nightly fuzz run decoder_push

The spikes/ directory holds the frozen de-risk prototypes (Spike 0 / M0) and is intentionally excluded from the workspace. See CONTRIBUTING.md for the full development workflow (sans-IO + channel-sim TDD, xtask sweeps, PR expectations).
A browser send/receive demo (Web Audio) lives in sonance-wasm/demo. Build the wasm bindings, then serve the directory:
cd sonance-wasm
wasm-pack build --target web --out-dir demo/pkg --release -- --features encryption
python3 -m http.server 8080
# open http://localhost:8080/demo/

Pick a profile, type or drop in a file, and play it out the speakers; a second tab (or device) listening on the mic decodes it back byte-exact. The Auto profile measures the link and picks the fastest rung it can decode. Tick Encrypt and share a passphrase to seal the payload end-to-end (XChaCha20-Poly1305); both sides see a 6-digit safety number that must match. Building without -- --features encryption simply omits the Encrypt controls.
Sound is a broadcast medium — anyone in earshot can decode. By default payloads are opaque but not encrypted: the base sonance is a modem, not a secure channel. Transmission is not confidential unless you turn on the encryption layer below.
sonance ships an optional, off-by-default encryption layer (the sonance-crypt crate) designed to keep payloads confidential even against a well-resourced, recording adversary. It is built entirely from audited RustCrypto primitives:
- AEAD: XChaCha20-Poly1305 (the WireGuard/libsodium class) seals the payload. A fresh random 24-byte nonce is generated per message, so callers never manage nonces, and any tampering (a flipped bit, a wrong key, mismatched associated data) yields no output rather than forged plaintext.
- Key agreement: ephemeral X25519 Diffie-Hellman. Because the keys are ephemeral and discarded after the session, you get forward secrecy: a later device seizure can't decrypt recordings of past sessions. The shared secret plus an order-independent transcript of both public keys is run through a BLAKE3 derive_key KDF to produce the 32-byte session key.
- MITM defense (SAS): both ends derive a 6-digit short authentication string from the handshake transcript (the ZRTP / Signal "safety number" pattern). An active man-in-the-middle must substitute different ephemeral keys with each side, which makes the two SAS values differ, so a human reading the 6 digits aloud detects the attack.
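
The order-independent transcript and the 6-digit truncation can be sketched as follows. This only shows the structure and is not sonance-crypt's derivation. To stay dependency-free it uses std's DefaultHasher as a stand-in digest, which is not cryptographic and must never be used for a real SAS. The real layer uses BLAKE3, and the function names here are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// STAND-IN digest for illustration only: DefaultHasher is NOT a
/// cryptographic hash. A real implementation would use a keyed or
/// domain-separated cryptographic hash such as BLAKE3.
fn toy_digest(parts: &[&[u8]]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        // Length-prefix each field so field boundaries are unambiguous.
        hasher.write(&(part.len() as u64).to_le_bytes());
        hasher.write(part);
    }
    hasher.finish()
}

/// Derive a 6-digit SAS from both public keys. Sorting the keys first makes
/// the transcript order-independent, so initiator and responder agree
/// regardless of which key each calls "mine".
fn sas_digits(pk_a: &[u8; 32], pk_b: &[u8; 32]) -> String {
    let (lo, hi) = if pk_a <= pk_b { (pk_a, pk_b) } else { (pk_b, pk_a) };
    let digest = toy_digest(&[&b"sas"[..], &lo[..], &hi[..]]);
    format!("{:06}", digest % 1_000_000) // zero-padded to exactly 6 digits
}
```

Both peers compute the same string whichever argument order they use. A man-in-the-middle who hands each side a different substituted key changes the transcript each peer hashes, so the two displayed values match only by chance.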
Turn it on per crate with the encryption Cargo feature:
cargo test -p sonance --features encryption

Two devices in the same room can negotiate a key over sound alone, with no pre-shared secret: the sonance::handshake module runs a two-way X25519 exchange over the same acoustic PHY (one fixed-format handshake frame each way, with turn-taking). Both ends come away with the same SessionKey and the same SAS:
use sonance::{Profile, handshake::{run, Role}};
use sonance::crypt::rand_core::OsRng;
// Each side runs this with opposite roles over its own sink + mic source.
let hs = run(Profile::Balanced, &mut sink, &mut mic, Role::Initiator, OsRng, 64)?;
println!("Confirm this matches your peer's screen: {}", hs.sas.digits());
// Then attach the derived key to the stream:
let mut tx = StreamSender::new(Profile::Balanced, sink, 180).with_key(hs.key);

StreamReceiver::with_key / the one-shot Sender/Receiver with_key mirror this on the read side. The same knobs are surfaced on the FFI (sonance_stream_encode_from_encrypted + sonance_stream_decoder_set_key) and WASM (encode_stream_encrypted, WasmStreamDecoder.set_key, and WasmHandshake exposing the key + SAS) bindings.
When there is no return path to run a handshake (one-way broadcast), derive the same key on both ends from a shared passphrase and salt:
let key = sonance::crypt::SessionKey::from_passphrase(b"correct horse battery staple", b"session-salt");

- Off by default, byte-identical when off. With the feature disabled (or enabled but no key attached), the wire bytes are identical to the unencrypted build: there is no format change and no crypto dependency in the default build. This is guarded by a byte-diff test (none_key_is_plaintext_passthrough).
- What it protects: payload confidentiality and integrity, forward secrecy, and (via the SAS) active-MITM detection. This is strong enough to be meaningful against a state-level eavesdropper recording the audio.
- What it does not hide: that a transmission is happening, its timing, and its approximate length (traffic analysis). It is a confidentiality layer, not a covert-channel / steganographic one.
MIT. See LICENSE.