CUDA Academy · Lesson

The Double-Buffering Pipeline

Chunking data to keep the GPU fed.

The Idle GPU Problem

The naive pattern copies a huge array to the GPU, runs a kernel on it, then copies the result back. During each copy the GPU's compute units sit idle, and during the kernel the copy engines sit idle.
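To put numbers on that idle time, here is a minimal host-side sketch that models one serial run. The struct, the function names, and any durations you feed in are hypothetical, chosen only for illustration; real timings would come from measuring your own copies and kernel.

```cpp
#include <cassert>

// Hypothetical per-stage durations (milliseconds) for one serial run.
struct StageTimes {
    double h2d_ms;     // host-to-device copy
    double kernel_ms;  // kernel execution
    double d2h_ms;     // device-to-host copy
};

// Wall time when the three stages run strictly one after another.
double serial_total_ms(const StageTimes& t) {
    return t.h2d_ms + t.kernel_ms + t.d2h_ms;
}

// Fraction of the run in which the compute units have nothing to do:
// both copies count as compute idle time.
double compute_idle_fraction(const StageTimes& t) {
    const double total = serial_total_ms(t);
    assert(total > 0.0);
    return (t.h2d_ms + t.d2h_ms) / total;
}

// Fraction of the run in which the copy engines have nothing to do:
// the kernel counts as copy idle time.
double copy_idle_fraction(const StageTimes& t) {
    const double total = serial_total_ms(t);
    assert(total > 0.0);
    return t.kernel_ms / total;
}
```

With made-up durations of 4 ms in, 2 ms of compute, and 2 ms out, the run takes 8 ms, the compute units are idle for 75% of it, and the copy engines are idle for the remaining 25%.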

Split the Work

The fix starts with breaking one giant array into smaller chunks. Each chunk can be copied and processed on its own, which opens the door to overlap: the copy of one chunk can proceed while the kernel works on another.
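The splitting step itself is plain index arithmetic. The sketch below is one possible way to do it on the host; `Chunk` and `split_into_chunks` are hypothetical names, and the chunk size is an assumption you would tune for your own workload.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous slice of the big array, measured in elements.
struct Chunk {
    std::size_t offset;  // index of the first element in the chunk
    std::size_t count;   // number of elements in the chunk
};

// Split `total` elements into chunks of at most `chunk_size` elements.
// The last chunk is shorter when `total` is not a multiple of `chunk_size`.
std::vector<Chunk> split_into_chunks(std::size_t total, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<Chunk> chunks;
    for (std::size_t offset = 0; offset < total; offset += chunk_size) {
        chunks.push_back({offset, std::min(chunk_size, total - offset)});
    }
    return chunks;
}
```

Splitting 10 elements with a chunk size of 4 gives three chunks: offsets 0, 4, and 8 with counts 4, 4, and 2. Each chunk's offset and count can then drive its own copy and its own kernel launch, with the byte size of a copy being `count * sizeof(element)`.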
