Add OPSD (On-Policy Distillation) training example by delock · Pull Request #1002 · deepspeedai/DeepSpeedExamples

delock · 2026-06-24T09:30:42Z

This PR moves the example code out of @PKUWZP's OPSD PR in DeepSpeed (deepspeedai/DeepSpeed#8027).

Note that the basic OPSD infrastructure (OPSD trainer, rollout engines) remains in the original PR and needs to be merged separately. This PR will work once the original PR is merged.

Entry point, configs, data, and tests for on-policy distillation using DeepSpeed's hybrid engine rollout and vLLM backend.

Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Guokai Ma <guokai.ma@gmail.com>
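For readers new to the technique, here is a minimal sketch of the kind of objective on-policy distillation commonly uses: the student generates the rollout, the teacher scores the same tokens, and the loss is a per-token reverse KL, KL(student || teacher). This is an assumption about the general method, not the code in this PR or in deepspeedai/DeepSpeed#8027; the function names (`log_softmax`, `reverse_kl`, `on_policy_distill_loss`) are hypothetical, and plain Python lists stand in for logit tensors.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) at a single token position."""
    log_p = log_softmax(student_logits)  # student distribution (the policy)
    log_q = log_softmax(teacher_logits)  # teacher distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def on_policy_distill_loss(student_steps, teacher_steps):
    """Mean per-token reverse KL over one student-generated rollout.

    Hypothetical helper: each argument is a list of logit vectors, one per
    generated token, computed on the same student-sampled sequence.
    """
    per_token = [reverse_kl(s, t) for s, t in zip(student_steps, teacher_steps)]
    return sum(per_token) / len(per_token)


# Toy rollout of two tokens over a three-word vocabulary.
student = [[1.0, 0.5, -0.2], [0.0, 0.0, 0.0]]
teacher = [[1.2, 0.1, -0.5], [2.0, 0.0, -1.0]]
loss = on_policy_distill_loss(student, teacher)
```

The loss is zero when the student and teacher agree on every position and positive otherwise. In a real setup the rollout would come from a generation backend and the logits from model forward passes.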

15faa6c Add OPSD (On-Policy Distillation) training example

delock requested a review from tjruwase as a code owner June 24, 2026 09:30

delock wants to merge 1 commit into master from gma/opsd
