DistributedDataParallel Basics
The standard multi-GPU training path.
Meet DDP
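DistributedDataParallel, or DDP, is PyTorch's standard tool for multi-GPU data-parallel training. It runs one process per GPU, each holding a full copy of the model, and keeps the copies in sync by averaging gradients across processes during the backward pass.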
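To make "in sync" concrete, here is a toy, framework-free sketch of gradient averaging, the mechanism DDP relies on. It is not PyTorch code: the helper names (`local_gradient`, `all_reduce_mean`, `synced_step`), the one-weight model `y = w * x`, and the data are all made up for illustration. Each simulated replica computes a gradient on its own data shard, the gradients are averaged, and every replica applies the same update.

```python
# Toy simulation of data-parallel training with gradient averaging.
# All names and data here are hypothetical; real DDP does this with
# collective communication between processes, not Python lists.

def local_gradient(w, shard):
    """Gradient of mean((w*x - y)^2) with respect to w on one replica's shard."""
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every replica receives the same mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def synced_step(weights, shards, lr):
    """One training step: local gradients, then average, then identical updates."""
    grads = [local_gradient(w, shard) for w, shard in zip(weights, shards)]
    averaged = all_reduce_mean(grads)
    return [w - lr * g for w, g in zip(weights, averaged)]


# Two replicas start from the same weight but see different data (true w is 2).
weights = [0.0, 0.0]
shards = [
    [(1.0, 2.0), (2.0, 4.0)],  # shard seen by replica 0
    [(3.0, 6.0), (4.0, 8.0)],  # shard seen by replica 1
]

for _ in range(50):
    weights = synced_step(weights, shards, lr=0.01)

print(weights)  # both entries are identical and close to 2
```

Because every replica starts from the same weight and applies the same averaged gradient, the copies never drift apart even though each one only sees its own shard.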
One Process per GPU
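Unlike the older DataParallel, which drives all GPUs from a single process using multiple threads, DDP runs a separate process for each GPU. The processes are started by a launcher such as torchrun rather than by DDP itself. Because each process has its own Python interpreter, the workers do not contend for the GIL, and DDP generally scales better, including across multiple machines.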
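The code each process runs might look like the minimal sketch below. It makes several assumptions: a single machine (so the global rank doubles as the GPU index), a tiny `nn.Linear` model and random data as placeholders, and a hypothetical helper named `run_one_step`. It falls back to the `gloo` backend on CPU when no GPU is available.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def run_one_step(rank: int, world_size: int) -> float:
    """Run a single DDP training step in this process and return the local loss."""
    use_cuda = torch.cuda.is_available()
    # A launcher such as torchrun normally sets these; the defaults are for local runs.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(
        "nccl" if use_cuda else "gloo", rank=rank, world_size=world_size
    )
    try:
        # Assumption: single node, so rank is also the local GPU index.
        device = torch.device(f"cuda:{rank}") if use_cuda else torch.device("cpu")
        if use_cuda:
            torch.cuda.set_device(device)

        model = nn.Linear(4, 1).to(device)  # placeholder model
        # DDP broadcasts rank 0's parameters so all copies start identical.
        ddp_model = DDP(model, device_ids=[rank] if use_cuda else None)
        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

        torch.manual_seed(rank + 1)  # each rank gets different placeholder data
        x = torch.randn(8, 4, device=device)
        y = torch.randn(8, 1, device=device)

        loss = nn.functional.mse_loss(ddp_model(x), y)
        optimizer.zero_grad()
        loss.backward()   # gradients are averaged across processes here
        optimizer.step()  # every process applies the same update
        return loss.item()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    # torchrun exports RANK and WORLD_SIZE; default to a single process otherwise.
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    print(f"rank {rank}: loss {run_one_step(rank, world_size):.4f}")
```

Saved as a script, this runs as a single process with plain `python`, or as several processes under a launcher that sets `RANK` and `WORLD_SIZE` for each one. In real training the random tensors would be replaced by a data loader that gives each rank its own slice of the dataset.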