Deep Learning Academy · Lesson

DistributedDataParallel Basics

The standard multi-GPU training path.

Meet DDP

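DistributedDataParallel, or DDP, is PyTorch's go-to tool for multi-GPU training. It is typically run with one process per GPU, each holding its own copy of the model, and it keeps the copies in sync by averaging gradients across all processes during the backward pass.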

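Unlike the older DataParallel, which drives every GPU from threads inside a single process, DDP uses a separate process for each GPU. DDP does not start those processes itself; a launcher such as torchrun usually does. Separate processes avoid contention on Python's GIL, and the same setup extends to multiple machines, which DataParallel cannot do.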
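
The pattern above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a full training script: the helper names run_worker and demo_single_process are hypothetical, the model and data are toy placeholders, and the process group has a single member on CPU with the "gloo" backend so the sketch runs without any GPU. In real multi-GPU training, a launcher would start one copy of the worker per GPU and the backend would typically be "nccl".

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def run_worker(rank: int, world_size: int, init_file: str) -> float:
    """Run one training step inside a single DDP process (hypothetical helper)."""
    # Every process joins the same group. "gloo" is used here because it
    # works on CPU; this is an assumption made so the sketch runs anywhere.
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file}",
        rank=rank,
        world_size=world_size,
    )
    try:
        torch.manual_seed(0)
        model = nn.Linear(4, 1)  # toy model, for illustration only
        # On a GPU this would usually be DDP(model.to(rank), device_ids=[rank]).
        ddp_model = DDP(model)
        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

        inputs = torch.randn(8, 4)   # toy batch
        targets = torch.randn(8, 1)
        loss = nn.functional.mse_loss(ddp_model(inputs), targets)

        optimizer.zero_grad()
        loss.backward()   # DDP averages gradients across processes during this call
        optimizer.step()  # every process then applies the same update
        return loss.item()
    finally:
        dist.destroy_process_group()


def demo_single_process() -> float:
    """Run the worker as the only member of its group (world_size=1)."""
    with tempfile.TemporaryDirectory() as tmp:
        init_file = os.path.join(tmp, "ddp_init")
        return run_worker(rank=0, world_size=1, init_file=init_file)


if __name__ == "__main__":
    print("loss after one step:", demo_single_process())
```

With more than one process, each process would call run_worker with its own rank and a shared world_size, and each would normally see a different shard of the data.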