CUDA Academy · Lesson

The PCIe Transfer Bottleneck

Why copies are often the slow part.

The Bridge Between Worlds

In a system with a discrete GPU, the CPU and GPU each have their own memory, and the two are connected by the PCIe bus. Every cudaMemcpy between host and device must squeeze through this narrow bridge. 🌉
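To make the bridge concrete, here is a minimal sketch of one round trip. The kernel `scale_by_two`, the helper `check`, and the function `round_trip_first_element` are hypothetical names invented for this illustration; only the CUDA runtime calls are real API. The comments mark which steps cross the bus and which stay on the GPU.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical example kernel: doubles every element in place.
__global__ void scale_by_two(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Copies n floats to the GPU, doubles them there, copies them back,
// and returns the first element of the result.
float round_trip_first_element(int n) {
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> host(n, 1.0f);

    float* dev = nullptr;
    check(cudaMalloc(&dev, bytes), "cudaMalloc");

    // Host -> device: this copy crosses the PCIe bus.
    check(cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice), "H2D copy");

    // The kernel reads and writes GPU memory only; no bus traffic here.
    scale_by_two<<<(n + 255) / 256, 256>>>(dev, n);
    check(cudaGetLastError(), "kernel launch");

    // Device -> host: crosses the PCIe bus again. This blocking copy runs
    // after the kernel, which was queued earlier on the same default stream.
    check(cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost), "D2H copy");

    check(cudaFree(dev), "cudaFree");
    return host[0];
}

int main() {
    std::printf("host[0] = %.1f\n", round_trip_first_element(1 << 20));
    return 0;
}
```

Of the three steps, two are bus crossings. When the kernel does as little work as this one, the copies are likely to take most of the total time.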

A Speed Mismatch

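GPU memory bandwidth can reach terabytes per second on high-end cards, but PCIe delivers only tens of gigabytes per second (roughly 32 GB/s in each direction for a PCIe 4.0 x16 link). The bus is the slow link, often by one to two orders of magnitude.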
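One way to see the mismatch on your own machine is to time the same copy in two directions: host to device (over the bus) and device to device (inside GPU memory). The sketch below uses CUDA events for timing. The helper names `gigabytes_per_second` and `time_copy`, the `CUDA_CHECK` macro, and the 256 MiB buffer size are choices made for this example, not part of any library. No expected output is shown because the numbers depend entirely on your hardware.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical macro: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Convert a byte count and elapsed milliseconds into GB/s (1 GB = 1e9 bytes).
static double gigabytes_per_second(size_t bytes, float ms) {
    return (static_cast<double>(bytes) / 1e9) / (static_cast<double>(ms) / 1e3);
}

// Time one cudaMemcpy of `bytes` bytes with CUDA events; returns GB/s copied.
static double time_copy(void* dst, const void* src, size_t bytes,
                        cudaMemcpyKind kind) {
    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    CUDA_CHECK(cudaMemcpy(dst, src, bytes, kind));  // warm-up, not timed

    CUDA_CHECK(cudaEventRecord(start));
    CUDA_CHECK(cudaMemcpy(dst, src, bytes, kind));
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));  // wait until the copy has finished

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    return gigabytes_per_second(bytes, ms);
}

int main() {
    const size_t bytes = static_cast<size_t>(256) << 20;  // 256 MiB, arbitrary

    void* h_buf = std::malloc(bytes);  // ordinary pageable host memory
    if (h_buf == nullptr) return EXIT_FAILURE;
    std::memset(h_buf, 0, bytes);

    void* d_a = nullptr;
    void* d_b = nullptr;
    CUDA_CHECK(cudaMalloc(&d_a, bytes));
    CUDA_CHECK(cudaMalloc(&d_b, bytes));

    // Over the bus: host memory -> GPU memory.
    double h2d = time_copy(d_a, h_buf, bytes, cudaMemcpyHostToDevice);
    // Inside the GPU: each byte is read once and written once, so the raw
    // memory traffic is about twice the figure reported here.
    double d2d = time_copy(d_b, d_a, bytes, cudaMemcpyDeviceToDevice);

    std::printf("host -> device:   %.1f GB/s\n", h2d);
    std::printf("device -> device: %.1f GB/s\n", d2d);

    CUDA_CHECK(cudaFree(d_a));
    CUDA_CHECK(cudaFree(d_b));
    std::free(h_buf);
    return 0;
}
```

On most discrete GPUs the second number should come out far larger than the first. The host-to-device figure will usually sit below the link's theoretical rate, since this sketch uses plain `malloc` memory and a single copy rather than a tuned benchmark.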
