CUDA Academy · Lesson

The Double-Buffering Pipeline

Chunking data to keep the GPU fed.

The Idle GPU Problem

The naive workflow copies a huge array to the GPU, runs a kernel on it, then copies the result back. During each copy the GPU's compute units sit idle, and during compute the transfer engines sit idle.

The fix starts with breaking one giant array into smaller chunks. Each chunk can be copied and processed on its own, which opens the door to overlapping one chunk's transfer with another chunk's compute.
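
The chunking step itself is plain host-side bookkeeping. Below is a minimal sketch of how the offsets and sizes might be computed, written as ordinary C++ that would also compile inside a .cu file. The names `Chunk` and `make_chunks` are hypothetical, introduced only for illustration. In a real pipeline each chunk's offset and count would typically feed an asynchronous copy call such as `cudaMemcpyAsync` and a kernel launch. Those calls are left out here so the sketch needs no GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous slice of the big array: elements [offset, offset + count).
struct Chunk {
    std::size_t offset;  // index of the first element in the chunk
    std::size_t count;   // number of elements in the chunk
};

// Split `total` elements into chunks of at most `chunk_elems` elements.
// The last chunk is shorter when `total` is not a multiple of `chunk_elems`.
std::vector<Chunk> make_chunks(std::size_t total, std::size_t chunk_elems) {
    assert(chunk_elems > 0 && "chunk size must be positive");
    std::vector<Chunk> chunks;
    for (std::size_t offset = 0; offset < total; offset += chunk_elems) {
        chunks.push_back({offset, std::min(chunk_elems, total - offset)});
    }
    return chunks;
}
```

For example, `make_chunks(10, 4)` yields three chunks: offsets 0, 4, and 8 with counts 4, 4, and 2. Handling the short final chunk explicitly matters because real array sizes are rarely an exact multiple of the chunk size.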