CUDA Academy · Lesson

Multi-GPU with NCCL

Collective communication for scaling.

Hand-Rolled Gets Hard

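Wiring peer-to-peer copies by hand gets messy fast: every pair of GPUs needs its own copies and synchronization, and the number of pairs grows quadratically with the GPU count. For real scaling you want a library built for collective communication.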

Meet NCCL

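NVIDIA's answer is NCCL, the NVIDIA Collective Communications Library. It provides collective operations such as all-reduce and broadcast, detects how the GPUs are connected, and automatically routes data over the fastest links available, such as NVLink or PCIe.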
