feat(backends): add NanoDeploy backend with dlslime-ctrl discovery by JimyMa · Pull Request #15 · DeepLink-org/DLRouter

JimyMa · 2026-06-02T16:55:11Z

Summary

Integrate NanoDeploy's single-process OpenAI server (nanodeploy serve) as a first-class DLRouter backend (--backend nanodeploy).
Add BackendType.NANODEPLOY and the nanoctrl service-discovery mode that polls a dlslime-ctrl entity registry for nanodeploy nodes and reconciles their HTTP endpoints into the NodeManager (served-model-name, model-path, and basename aliases).
Auto-discovery activates in hybrid serving when --ctrl_address is set; manual POST /nodes/add still works otherwise.
Docs: README updated with a supported-backend row, a dedicated NanoDeploy + dlslime-ctrl quick start, and a request example.
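
The reconcile step of the discovery mode described above can be pictured as a diff between the latest registry poll and the nodes already known. This is a minimal sketch, not the PR's implementation: `DiscoveredNode`, its fields, and `reconcile` are hypothetical names, and the real registry payload and NodeManager API are not shown in this PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveredNode:
    """One node as a registry poll might report it (hypothetical shape)."""
    url: str
    served_model_name: str
    model_path: str


def reconcile(known, polled):
    """Diff the latest poll against the known nodes, keyed by URL.

    Returns (to_add, to_remove) and updates `known` in place so the
    next poll is compared against the current state.
    """
    latest = {node.url: node for node in polled}
    to_add = [node for url, node in latest.items() if url not in known]
    to_remove = [url for url in known if url not in latest]
    known.clear()
    known.update(latest)
    return to_add, to_remove
```

A caller would add each node in `to_add` to the router and drop each URL in `to_remove`, then repeat on the next poll interval.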

Running NanoDeploy with DLRouter

1. Start the dlslime-ctrl control plane (only needed for auto-discovery)

dlslime-ctrl server --redis-url redis://127.0.0.1:6379

2. Start the NanoDeploy OpenAI server

# inside the nanodeploy conda env
nanodeploy serve /path/to/Qwen3-0.6B \
  --host 0.0.0.0 --port 8100 \
  --served-model-name Qwen3-0.6B \
  --ctrl_address 127.0.0.1:4479

Notes:

The positional argument is the model path (you can also use --model /path/to/...).
--served-model-name is the public model id; if omitted it defaults to the basename of the model path.
--ctrl_address enables self-registration + heartbeat to dlslime-ctrl. Omit it to run as a standalone HTTP server.
All other Config fields (--ray_address, --tp, etc.) share the same names/semantics as engine_server.py. --host/--port bind the uvicorn HTTP API.
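
The naming rule in the notes above (the served name falls back to the basename of the model path, and served name, path, and basename all resolve to the same model) can be sketched as follows. `model_aliases` is a hypothetical helper, and the handling of a trailing slash is an assumption rather than documented NanoDeploy behavior.

```python
import os


def model_aliases(model_path, served_model_name=None):
    """Return the names a request could use for this model, public id first."""
    # Assumption: a trailing slash on the path is ignored for the basename.
    basename = os.path.basename(model_path.rstrip("/"))
    public_id = served_model_name or basename
    aliases = []
    for name in (public_id, model_path, basename):
        if name and name not in aliases:
            aliases.append(name)
    return aliases
```

With `--served-model-name` omitted, the public id and the basename coincide, so the list holds two entries instead of three.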

3. Call NanoDeploy directly (bypass DLRouter, verify the server itself)

curl http://localhost:8100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen3-0.6B","messages":[{"role":"user","content":"Hello"}]}'

Other endpoints:

curl http://localhost:8100/health        # health check
curl http://localhost:8100/v1/models     # served-name / path / basename are all aliases

# /v1/completions (text completion)
curl http://localhost:8100/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen3-0.6B","prompt":"Once upon a time","max_tokens":64}'

# streaming
curl -N http://localhost:8100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen3-0.6B","messages":[{"role":"user","content":"Hello"}],"stream":true}'
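
For consuming the streaming response from a script rather than curl, a small parser is enough. This sketch assumes the server emits OpenAI-style server-sent events (`data: {...}` lines with `choices[].delta.content`, terminated by `data: [DONE]`); the PR describes the server as OpenAI-compatible but does not show the wire format, and `iter_stream_text` is a hypothetical helper.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP response object or from a captured transcript.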

4. Call through DLRouter (end-to-end)

# DLRouter auto-discovers NanoDeploy nodes from dlslime-ctrl
python -m dlrouter \
  --backend nanodeploy \
  --serving_strategy hybrid \
  --ctrl_address 127.0.0.1:4479

# Request hits port 8000 (DLRouter), which forwards to 8100 (NanoDeploy)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen3-0.6B","messages":[{"role":"user","content":"Hello"}]}'

Without dlslime-ctrl, drop --ctrl_address on the DLRouter side and register the node manually:

curl -X POST http://localhost:8000/nodes/add \
  -H "Content-Type: application/json" \
  -d '{"url":"http://127.0.0.1:8100"}'
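
The same manual registration can be issued from Python using only the standard library. This is a sketch: `build_add_node_request` is a hypothetical helper, and the only facts taken from the PR are the `POST /nodes/add` path and the `{"url": ...}` body shown in the curl command above.

```python
import json
import urllib.request


def build_add_node_request(router_url, node_url):
    """Build the POST /nodes/add request without sending it."""
    body = json.dumps({"url": node_url}).encode("utf-8")
    return urllib.request.Request(
        url=router_url.rstrip("/") + "/nodes/add",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen(req, timeout=5)` to send it; the response format is not described in this PR.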

Test plan

pytest tests/core/test_nanoctrl_discovery.py tests/backends/test_backend_contracts.py
End-to-end: nanodeploy serve + dlslime-ctrl + python -m dlrouter --backend nanodeploy --serving_strategy hybrid --ctrl_address 127.0.0.1:4479, then a /v1/chat/completions curl (verified working manually).
Prefill/decode (PD) disaggregated serving (distserve) test on NV, PPU, and muxi.


Integrate NanoDeploy's single-process OpenAI server (`nanodeploy serve`) as a first-class DLRouter backend. Adds the `nanodeploy` BackendType and the `nanoctrl` service-discovery mode, which polls a dlslime-ctrl entity registry for `nanodeploy` nodes and reconciles their HTTP endpoints (served model name, model path, and basename aliases) into the NodeManager. Auto-discovery activates in hybrid serving when `--ctrl_address` is set; manual `POST /nodes/add` still works otherwise.

Co-authored-by: Cursor <cursoragent@cursor.com>

Implement prefill/decode disaggregation for the NanoDeploy backend:

- supports_pd_disagg() now returns True and handle_pd_request runs the two-stage flow: the prefill node returns a KV migration payload, the decode node RDMA-pulls the KV and generates the completion, then the prefill KV blocks are released via POST /pd/free.
- Forward kv_transfer_params to NanoDeploy serve nodes.
- When the prefill node fully finishes a request locally (e.g. the first token is EOS), it returns no migration payload; return that completion directly (with a streaming SSE fallback) instead of erroring.
- nanoctrl discovery maps entity metadata.role -> EngineRole PREFILL/DECODE/HYBRID instead of always HYBRID.
- Update backend contract and discovery tests accordingly.

Co-authored-by: Cursor <cursoragent@cursor.com>
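
The control flow in this commit message can be sketched with the network calls injected as plain callables. Everything here is illustrative: `run_pd_flow`, the `completion` key, and the local `EngineRole` enum are hypothetical stand-ins for the PR's `handle_pd_request` and `EngineRole`; falling back to HYBRID for an unknown role and releasing KV blocks in a `finally` clause are assumptions of this sketch, not confirmed behavior.

```python
from enum import Enum


class EngineRole(Enum):
    """Stand-in for the router's EngineRole (values are assumed)."""
    PREFILL = "prefill"
    DECODE = "decode"
    HYBRID = "hybrid"


def role_from_metadata(metadata):
    """Map entity metadata.role to an EngineRole; unknown or missing -> HYBRID."""
    raw = str(metadata.get("role", "")).lower()
    try:
        return EngineRole(raw)
    except ValueError:
        return EngineRole.HYBRID


def run_pd_flow(request, prefill, decode, free):
    """Two-stage prefill/decode flow.

    prefill(request) -> dict with "kv_transfer_params" or a finished "completion"
    decode(request)  -> the final completion
    free(params)     -> stands in for POST /pd/free on the prefill node
    """
    prefill_result = prefill(request)
    kv_params = prefill_result.get("kv_transfer_params")
    if not kv_params:
        # Prefill finished the request locally (e.g. first token was EOS):
        # there is nothing to migrate, so return its completion directly.
        return prefill_result["completion"]
    try:
        return decode({**request, "kv_transfer_params": kv_params})
    finally:
        # Release the prefill-side KV blocks whether or not decode succeeded.
        free(kv_params)
```

Injecting the three callables keeps the ordering (prefill, decode, free) testable without a running prefill or decode node.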

CLAassistant · 2026-06-11T05:21:20Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 3 committers have signed the CLA.

❌ HuWen7
❌ JimyMa
❌ FirwoodLin
You have signed the CLA already but the status is still pending? Let us recheck it.

JimyMa requested a review from Denny991 June 3, 2026 06:05

JimyMa assigned caikun-pjlab and Denny991 Jun 3, 2026

JimyMa requested review from caikun-pjlab and removed request for Denny991 June 3, 2026 06:09

JimyMa and others added 4 commits June 7, 2026 12:53

Update README.md

46f3d7c

Update README.md

788c0f0

Rename NanoDeploy backend to DLEngine

7d121c0

FirwoodLin merged commit 4f902a8 into main Jun 11, 2026
0 of 2 checks passed


FirwoodLin merged 5 commits into main from init_nanodeploy_backend

JimyMa commented Jun 2, 2026 (edited by Denny991)

6 participants
