Add OPSD (On-Policy Distillation) training example #1002

Open
delock wants to merge 1 commit into master from gma/opsd

@delock delock commented Jun 24, 2026

Contributor

This PR moves the example code from @PKUWZP's OPSD PR in DeepSpeed (deepspeedai/DeepSpeed#8027).

Note that the basic OPSD infrastructure (the OPSD trainer and rollout engines) remains in the original PR and needs to be merged separately. This PR will work only once the original PR has been merged.

Entry point, configs, data, and tests for on-policy distillation
using DeepSpeed's hybrid engine rollout and vLLM backend.

Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Guokai Ma <guokai.ma@gmail.com>
@delock delock requested a review from tjruwase as a code owner June 24, 2026 09:30