Pull requests: deepspeedai/DeepSpeed
Pull requests list
Warn when zero.Init silently falls back to a single rank (#8084)
#8089
opened Jun 24, 2026 by
akshansh47
fix: AutoTP partition_config uses full hierarchical module path
#8088
opened Jun 24, 2026 by
delock
Collaborator
fix: use local ev_values and wrap dict.values() in list()
#8087
opened Jun 23, 2026 by
hashwnath
3 tasks done
fix: add buffer-length check in shm.cpp
#8082
opened Jun 20, 2026 by
orbisai0security
Contributor
3 tasks done
fix: sanitize subprocess call in ds_aio_job.py
#8081
opened Jun 20, 2026 by
orbisai0security
Contributor
3 tasks done
ZeRO 1/2: wait on all IPG-bucket producer streams in average_tensor (#8061)
#8080
opened Jun 19, 2026 by
arunshar
Contributor
Avoid CUDA context initialization during op compatibility checks at import
#8078
opened Jun 19, 2026 by
Achyuthan-S
Add configurable engine log level
#8067
opened Jun 15, 2026 by
sfc-gh-truwase
Collaborator
2 tasks
feat: add Trackio as a new experiment monitoring backend
#8065
opened Jun 15, 2026 by
chanduripranav
Support AutoEP with ZeRO-3 zero.Init source modules
#8060
opened Jun 11, 2026 by
tohtana
Collaborator
[DeepCompile] fix gather params in dynamo skipped frames for ZeRO3
#8059
opened Jun 11, 2026 by
XAheli
7 tasks done
feat(zenflow): run the overlapped CPU optimizer in a native process
#8058
opened Jun 10, 2026 by
Antlera
Collaborator
Fix eigenvalue parsing for compression-only quantize configs
#8057
opened Jun 10, 2026 by
sowndappan5
Contributor
Add optional torchembed RoPE backend to apply_rotary_pos_emb
#8052
opened Jun 7, 2026 by
py-ai-dev
Fix minor comment/docstring typos in runtime and inference modules
#8046
opened Jun 3, 2026 by
nathon-lee
Contributor
zero3: defer param release during retain_graph backward #7352
#8045
opened Jun 3, 2026 by
nathon-lee
Contributor
Enable bf16 check_grad_overflow by default (matching fp16)
#8035
opened May 29, 2026 by
yongzhe-wang
2 tasks done
[Draft] Add ZeRO-3 elastic checkpoint save/load support
#8031
opened May 28, 2026 by
nathon-lee
Contributor
Draft
[Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed
#8027
opened May 26, 2026 by
PKUWZP
Collaborator
3 of 5 tasks