CUDA Academy · Lesson

Peer-to-Peer Memory Access

Direct GPU-to-GPU copies over NVLink or PCIe.

The Slow Detour

Without peer-to-peer access, a copy from one GPU to another is staged through host memory: the data moves device-to-host, then host-to-device. The extra hop adds latency and consumes host memory bandwidth that the CPU could otherwise use.
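
To make the detour concrete, here is a minimal sketch of a copy staged by hand through a pinned host buffer. The device indices 0 and 1, the 64 MiB size, and the CHECK macro are assumptions chosen for illustration, not part of the lesson.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    const size_t bytes = 64 << 20;  // 64 MiB, an arbitrary size
    void *src = nullptr, *dst = nullptr, *staging = nullptr;

    // Source buffer on GPU 0, destination buffer on GPU 1.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaMemset(src, 1, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));

    // Pinned (page-locked) host buffer used as the staging area.
    CHECK(cudaMallocHost(&staging, bytes));

    // Hop 1: GPU 0 -> host memory.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMemcpy(staging, src, bytes, cudaMemcpyDeviceToHost));

    // Hop 2: host memory -> GPU 1.
    CHECK(cudaSetDevice(1));
    CHECK(cudaMemcpy(dst, staging, bytes, cudaMemcpyHostToDevice));

    CHECK(cudaFreeHost(staging));
    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(src));
    return 0;
}
```

Every byte crosses the host interconnect twice and occupies host memory in between, which is the cost described above.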

GPUs Talking Directly

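When the hardware and system topology support it, GPUs can bypass host memory entirely. Peer-to-peer access lets one GPU read and write another's memory over a direct link such as NVLink or PCIe. Support depends on the specific pair of devices, so a program should query it at run time before relying on it.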
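
A sketch of how this might look with the CUDA runtime API follows. It queries support with cudaDeviceCanAccessPeer, enables access with cudaDeviceEnablePeerAccess, lets a kernel running on GPU 0 write into memory allocated on GPU 1, and then copies that memory back with cudaMemcpyPeer. The device indices, buffer size, the fill kernel, and the CHECK macro are illustrative assumptions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Illustrative kernel: writes `value` into every element of `data`.
__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Ask whether GPU 0 can directly address memory on GPU 1.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 0, 1));
    if (!canAccess) {
        std::printf("GPU 0 cannot access GPU 1 directly on this system.\n");
        return 0;
    }

    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *local = nullptr, *remote = nullptr;

    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&remote, bytes));  // lives on GPU 1

    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&local, bytes));   // lives on GPU 0
    // Enable access from the current device (0) to GPU 1; flags must be 0.
    CHECK(cudaDeviceEnablePeerAccess(1, 0));

    // A kernel running on GPU 0 writes straight into GPU 1's memory.
    fill<<<(n + 255) / 256, 256>>>(remote, n, 42);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Copy GPU 1 -> GPU 0 without an explicit host staging buffer.
    CHECK(cudaMemcpyPeer(local, 0, remote, 1, bytes));

    int first = 0;
    CHECK(cudaMemcpy(&first, local, sizeof(int), cudaMemcpyDeviceToHost));
    std::printf("first element: %d\n", first);

    CHECK(cudaDeviceDisablePeerAccess(1));
    CHECK(cudaFree(local));
    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(remote));
    return 0;
}
```

Peer access is one-directional: enabling it from GPU 0 to GPU 1 does not let GPU 1 address GPU 0's memory, so bidirectional access needs a second cudaDeviceEnablePeerAccess call made while GPU 1 is the current device.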
