rnet-inference

A peer-to-peer swarm inference engine for Small Language Models, built on top of rnet-p2p.

rnet-inference turns a cluster of independent nodes, each running its own SLM, into a self-organizing inference swarm. A prompt is broadcast over a live floodsub mesh, one subset of nodes races to generate responses, a separate subset verifies and scores every response, and the node with the highest average score wins. No central coordinator. No trusted oracle. Just signed messages, gossip, and math.

Overview

rnet-inference answers a concrete question: can a swarm of small, cheap models produce more trustworthy outputs than a single large one, when each model both generates and judges?

The approach is deliberately minimal — no blockchain, no off-chain registry, no external oracle. Consensus emerges from the mesh itself: nodes self-select into executor or verifier roles, executors race to generate, verifiers score every response, and the leader tallies average scores to pick a winner. The entire session is coordinated over floodsub with bincode-serialized payloads, riding the same transport and multiplexing layers as the underlying rnet-p2p stack.

SLM swarm inference, with 3 executor nodes and 1 verifier node

Demo video: rnet-inference.mp4

How It Works

A session moves through four stages, all coordinated over a live P2P mesh:

1. Advertise: A leader node broadcasts a task to the swarm. Other nodes see it and join in.

2. Execute: Joined nodes are randomly split into two groups: executors (who generate a response) and verifiers (who will score those responses). Every executor runs the prompt through its local model and broadcasts the result.

3. Verify: Each verifier scores every response it receives on a scale of 0.00 to 0.99, using its own local model as a judge. Scores are broadcast back to the leader.

4. Finalize: The leader averages the scores for each executor and picks the winner, the node whose response earned the highest average across all verifiers.

No trust required. The swarm self-organizes, and consensus comes from the scores.
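The Verify and Finalize stages come down to a small piece of arithmetic, which can be sketched as follows. This is an illustration only: the `Score` struct, `clamp_score`, and `pick_winner` are hypothetical names rather than items from the inode crate, and breaking ties by executor id is an assumption, since the README does not say how ties are resolved.

```rust
use std::collections::HashMap;

/// One verifier's score for one executor's response.
/// Field names are illustrative, not taken from the rnet-inference source.
#[derive(Debug, Clone)]
pub struct Score {
    pub executor: String,
    pub verifier: String,
    pub value: f64,
}

/// Clamp a raw score into the 0.00..=0.99 range used by verifiers.
/// Non-finite values (NaN, infinities) are treated as 0.0 (an assumption).
pub fn clamp_score(raw: f64) -> f64 {
    if raw.is_finite() {
        raw.clamp(0.0, 0.99)
    } else {
        0.0
    }
}

/// Average the scores per executor and return the executor with the highest mean.
/// Ties go to the lexicographically smaller executor id so the result is deterministic.
pub fn pick_winner(scores: &[Score]) -> Option<(String, f64)> {
    let mut totals: HashMap<&str, (f64, usize)> = HashMap::new();
    for s in scores {
        let entry = totals.entry(s.executor.as_str()).or_insert((0.0, 0));
        entry.0 += clamp_score(s.value);
        entry.1 += 1;
    }
    totals
        .into_iter()
        .map(|(id, (sum, n))| (id.to_string(), sum / n as f64))
        .max_by(|a, b| a.1.total_cmp(&b.1).then_with(|| b.0.cmp(&a.0)))
}
```

For example, if executor `a` receives 0.5 and 0.7 while executor `b` receives 0.9 and 0.2, the means are 0.60 and 0.55, so `a` wins even though `b` earned the single highest score.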

Node Roles

There are two types of nodes:

Bootstrap node — runs on a fixed address, tracks who's online, and keeps the swarm's peer list in sync. Doesn't participate in inference itself.

Provider nodes — the workers. Each one connects to the bootstrap node on startup, then can act as a leader, executor, or verifier depending on the session.
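The random split of joined providers into executors and verifiers could look roughly like the sketch below. The README does not say how the split is sized or which random source is used, so the `verifier_count` and `seed` parameters, the `Role` enum, and the small xorshift generator are all assumptions made for illustration; a real node would more likely use the `rand` crate.

```rust
/// Roles a provider can hold within one session (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Executor,
    Verifier,
}

/// Tiny xorshift64 generator so the sketch needs no external crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Shuffle the joined peers (Fisher-Yates), make the first `verifier_count`
/// of them verifiers, and make the rest executors.
pub fn assign_roles(peers: &[String], verifier_count: usize, seed: u64) -> Vec<(String, Role)> {
    let mut shuffled: Vec<String> = peers.to_vec();
    let mut rng = XorShift64::new(seed);
    for i in (1..shuffled.len()).rev() {
        // The modulo introduces a slight bias, which is acceptable for a sketch.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        shuffled.swap(i, j);
    }
    let verifier_count = verifier_count.min(shuffled.len());
    shuffled
        .into_iter()
        .enumerate()
        .map(|(i, peer)| {
            let role = if i < verifier_count { Role::Verifier } else { Role::Executor };
            (peer, role)
        })
        .collect()
}
```

Calling `assign_roles(&peers, 1, seed)` with four joined peers yields one verifier and three executors, the same shape as the demo session.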

Project Structure

rnet-inference/
├── inode/          # Core library — P2P node, inference logic, SLM client
├── examples/
│   ├── raw/        # Runnable binary with interactive CLI
│   └── agentic/    # Autonomous agent mode (in progress)
└── logs/

Configuration

Copy .env.example to .env. The file contains a pre-generated RSA private key for the bootstrap node and its expected multiaddr:

BOOTSTRAP_PVT_KEY=<rsa_pkcs8_hex>
BOOTSTRAP_NODE=/ip4/127.0.0.1/tcp/8888/p2p/6wrGkCFmUe2H8c23pCfPcXktZztSu2eckvqaWP6y4E2w

Provider nodes read BOOTSTRAP_NODE on startup to connect to the mesh. The bootstrap node reads BOOTSTRAP_PVT_KEY so its peer ID is deterministic and matches the multiaddr above.
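To make the shape of BOOTSTRAP_NODE concrete, here is a minimal sketch that splits such a multiaddr into its IP address, TCP port, and peer ID using only the standard library. `BootstrapAddr`, `parse_bootstrap_addr`, and `bootstrap_from_env` are hypothetical helpers, not part of rnet-p2p, and the sketch only handles the `/ip4/.../tcp/.../p2p/...` form shown above. It also assumes the variable has already been loaded from .env into the process environment.

```rust
use std::net::Ipv4Addr;

/// Parsed pieces of a `/ip4/<addr>/tcp/<port>/p2p/<peer-id>` multiaddr.
#[derive(Debug, PartialEq, Eq)]
pub struct BootstrapAddr {
    pub ip: Ipv4Addr,
    pub port: u16,
    pub peer_id: String,
}

/// Parse the multiaddr form used in .env.example; other forms are rejected.
pub fn parse_bootstrap_addr(maddr: &str) -> Result<BootstrapAddr, String> {
    let parts: Vec<&str> = maddr.trim().split('/').collect();
    // The leading '/' produces an empty first element.
    match parts.as_slice() {
        ["", "ip4", ip, "tcp", port, "p2p", peer_id] if !peer_id.is_empty() => Ok(BootstrapAddr {
            ip: ip
                .parse::<Ipv4Addr>()
                .map_err(|e| format!("bad ip4 address {ip:?}: {e}"))?,
            port: port
                .parse::<u16>()
                .map_err(|e| format!("bad tcp port {port:?}: {e}"))?,
            peer_id: peer_id.to_string(),
        }),
        _ => Err(format!("unsupported multiaddr shape: {maddr}")),
    }
}

/// Read BOOTSTRAP_NODE from the environment and parse it.
pub fn bootstrap_from_env() -> Result<BootstrapAddr, String> {
    let raw = std::env::var("BOOTSTRAP_NODE").map_err(|e| format!("BOOTSTRAP_NODE: {e}"))?;
    parse_bootstrap_addr(&raw)
}
```

The peer ID is kept as an opaque string here. Checking that it matches the key in BOOTSTRAP_PVT_KEY is left to the P2P stack.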

Setting up a local LLM

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run qwen2.5:1.5b


# Send a test prompt to confirm the model responds
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:1.5b",
  "prompt": "How are you doing",
  "stream": false
}'
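The same request can be issued from Rust. The sketch below uses only the standard library and is not the project's actual SLM client: `generate_body` and `generate` are hypothetical names, the JSON is assembled by hand to avoid pulling in a crate, and the response is returned as raw HTTP text (status line, headers, and body, possibly chunked) rather than parsed.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Escape a string so it can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the same JSON body as the curl command above.
pub fn generate_body(model: &str, prompt: &str) -> String {
    format!(
        "{{\"model\":\"{}\",\"prompt\":\"{}\",\"stream\":false}}",
        json_escape(model),
        json_escape(prompt)
    )
}

/// POST the body to /api/generate over plain HTTP/1.1 and return the raw
/// response text. `host` is a "host:port" string such as "localhost:11434".
pub fn generate(host: &str, model: &str, prompt: &str) -> std::io::Result<String> {
    let body = generate_body(model, prompt);
    let request = format!(
        "POST /api/generate HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    );
    let mut stream = TcpStream::connect(host)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With the container from the previous step running, `generate("localhost:11434", "qwen2.5:1.5b", "How are you doing")` should return the HTTP response that carries the model's reply. A production client would more likely use an HTTP library and a JSON parser.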

Running a Bootstrap Node

cargo run --bin raw -- bootstrap

The bootstrap node will bind to port 8888 and begin tracking the mesh.

Running Provider Nodes

Open a new terminal for each provider (they bind to ephemeral ports):

# Terminal 2
cargo run --example raw

# Terminal 3
cargo run --example raw

# Terminal 4
cargo run --example raw

Each provider connects to the bootstrap node and subscribes to swarm/mesh. After a 2-second settle window, the node drops into the interactive CLI.

Interactive CLI

Command => help

      help                       => print all the commands
      local                      => get local peer-info
      connect <maddr>            => connect with a new peer
      ping <maddr> <count>       => exchange ping with a peer
      peers                      => list the connected peers

      fsub <maddr>               => open a new floodsub stream with the peer
      join <topic>               => subscribe to a new-topic
      leave <topic>              => unsubscribe to a new-topic
      publish <topic> <msg>      => publish a msg to a topic
      topics                     => list the subscribed topics
      fpeers                     => list the connected Floodsub peers
      bootmesh                   => map of topics -> peer (BOOTSTRAP)
      mesh                       => map of topics -> peer

      slm                        => converse with the AI
      adv <topic>                => advertize a exec/verify session
      ack <topic>                => acknowlege the EXECS/VERIFIERS
      finalize <topic>           => finalize the winner response

      pipe <provider_count>      => Test out the automated pipeline

Manual session example (run from one of the provider nodes):

# 1. Start advertising a session
Command => adv my-task-01

# 2. Wait for peers to join, then acknowledge and assign roles
Command => ack my-task-01

# 3. After all verifiers have scored, finalize
Command => finalize my-task-01

Local SLM test:

Command => slm What is the Byzantine Generals Problem?

Automated Pipeline

The pipe command runs a full end-to-end session programmatically — no manual ack or finalize needed. It blocks until the expected number of providers join, then drives the full four-stage protocol and prints the winning response.

# Run from the leader node; expects 3 other provider nodes to be online
Command => pipe 3

The pipeline waits for provider_count participants to appear in the task's mesh slice, assigns roles, waits for execs.len() × verifiers.len() scores to arrive, then selects and prints the winner.
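The completion condition described above, one score per (executor, verifier) pair, can be sketched as a small tracker. `ScoreTracker` and its methods are hypothetical and only illustrate the counting rule; ignoring duplicate pairs and scores from peers outside the session are assumptions about how a careful leader might behave, not documented behavior.

```rust
use std::collections::HashSet;

/// Tracks which (executor, verifier) score pairs have arrived for one session.
pub struct ScoreTracker {
    execs: HashSet<String>,
    verifiers: HashSet<String>,
    seen: HashSet<(String, String)>,
}

impl ScoreTracker {
    pub fn new(execs: &[String], verifiers: &[String]) -> Self {
        ScoreTracker {
            execs: execs.iter().cloned().collect(),
            verifiers: verifiers.iter().cloned().collect(),
            seen: HashSet::new(),
        }
    }

    /// Number of scores the leader waits for: execs.len() * verifiers.len().
    pub fn expected(&self) -> usize {
        self.execs.len() * self.verifiers.len()
    }

    /// Record one score. Returns false (and records nothing) when the pair is
    /// a duplicate or names a peer that holds no role in this session.
    pub fn record(&mut self, executor: &str, verifier: &str) -> bool {
        if !self.execs.contains(executor) || !self.verifiers.contains(verifier) {
            return false;
        }
        self.seen.insert((executor.to_string(), verifier.to_string()))
    }

    /// True once every verifier has scored every executor.
    pub fn is_complete(&self) -> bool {
        self.seen.len() == self.expected()
    }
}
```

With 3 executors and 1 verifier the tracker expects 3 scores. With 2 executors and 2 verifiers it expects 4.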

Roadmap

logs/chores.md and the agentic example stub point to active work in progress:

Agentic mode — autonomous provider nodes that join, bid on, and execute sessions without manual CLI interaction.
Agent tools (inode/src/agent/tools.rs) — tool-use scaffold for executor nodes to call external APIs during generation.
Timeout handling — pipeline() currently loops forever waiting for participants. A configurable deadline with graceful degradation is planned.
Dynamic Ollama endpoint — model URL is currently hardcoded to http://localhost:11434; per-node configuration is on the list.
Stake / reputation weighting — scoring is currently a flat average. A weighted scheme that discounts outlier verifiers over time would improve robustness.
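As one possible direction for the timeout item, the sketch below polls a readiness check until it succeeds or a deadline passes. It is not the planned design: `wait_with_deadline` and `WaitOutcome` are hypothetical names, and the polling approach is only an assumption about how a bounded wait could replace an unbounded loop.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Result of a bounded wait.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitOutcome {
    Ready,
    TimedOut,
}

/// Call `is_ready` every `interval` until it returns true or `timeout` elapses.
/// The check always runs at least once, even with a zero timeout.
pub fn wait_with_deadline<F>(mut is_ready: F, timeout: Duration, interval: Duration) -> WaitOutcome
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if is_ready() {
            return WaitOutcome::Ready;
        }
        let now = Instant::now();
        if now >= deadline {
            return WaitOutcome::TimedOut;
        }
        // Never sleep past the deadline.
        sleep(interval.min(deadline - now));
    }
}
```

A leader could pass a closure that checks whether enough participants have joined. On `WaitOutcome::TimedOut` it could then decide whether to proceed with the peers it has or abandon the session.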
