CUDA Academy · Lesson

cudaMemcpyAsync in a Stream

Non-blocking transfers that overlap with host work.

Blocking by Default

Plain cudaMemcpy stops your CPU until the copy finishes. Your host thread just waits, doing nothing useful while bytes move.
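
To make the blocking behavior concrete, here is a minimal sketch of a host-to-device-and-back round trip using plain cudaMemcpy. The helper name blocking_roundtrip, the CHECK macro, and the buffer size are illustrative assumptions, not part of the lesson.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(err_));                    \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Hypothetical example function: copies n floats to the device and back
// with blocking cudaMemcpy calls. Returns true if the data round-trips intact.
bool blocking_roundtrip(size_t n) {
    const size_t bytes = n * sizeof(float);
    std::vector<float> src(n), dst(n, -1.0f);
    for (size_t i = 0; i < n; ++i) src[i] = static_cast<float>(i);

    float* d_buf = nullptr;
    CHECK(cudaMalloc(&d_buf, bytes));

    // The host thread waits inside each call. When the call returns,
    // the host buffer involved is safe to reuse (src) or read (dst).
    CHECK(cudaMemcpy(d_buf, src.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dst.data(), d_buf, bytes, cudaMemcpyDeviceToHost));

    CHECK(cudaFree(d_buf));
    return src == dst;
}

int main() {
    const bool ok = blocking_roundtrip(1 << 20);
    std::printf("blocking round trip %s\n", ok ? "succeeded" : "FAILED");
    return ok ? 0 : 1;
}
```

Nothing else runs on the host thread between the two copies: any work placed after a cudaMemcpy line starts only once that call has returned.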

The Async Cousin

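Meet cudaMemcpyAsync. It queues the copy on a stream and returns control to your host thread immediately, so the CPU can keep working while the transfer happens. The copy is only truly asynchronous when the host buffer is pinned (see lesson 2); with pageable memory the call may still block.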
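
The same round trip can be sketched with cudaMemcpyAsync. This is a minimal illustration, assuming pinned host buffers from cudaMallocHost and a single user-created stream; the helper name async_roundtrip, the CHECK macro, and the placeholder host work are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(err_));                    \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Hypothetical example function: queues both copies on one stream, does
// unrelated host work, then waits for the stream before checking the data.
bool async_roundtrip(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_src = nullptr, *h_dst = nullptr, *d_buf = nullptr;

    // Pinned (page-locked) host memory, so the copies can run asynchronously.
    CHECK(cudaMallocHost(&h_src, bytes));
    CHECK(cudaMallocHost(&h_dst, bytes));
    CHECK(cudaMalloc(&d_buf, bytes));
    for (size_t i = 0; i < n; ++i) {
        h_src[i] = static_cast<float>(i);
        h_dst[i] = -1.0f;
    }

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // Both calls only enqueue work. Operations in the same stream run in
    // order, so the device-to-host copy starts after the upload completes.
    CHECK(cudaMemcpyAsync(d_buf, h_src, bytes, cudaMemcpyHostToDevice, stream));
    CHECK(cudaMemcpyAsync(h_dst, d_buf, bytes, cudaMemcpyDeviceToHost, stream));

    // Placeholder host work that does not touch the buffers being copied.
    double host_sum = 0.0;
    for (size_t i = 0; i < n; ++i) host_sum += static_cast<double>(i);
    std::printf("host computed sum %.0f while copies were queued\n", host_sum);

    // h_dst must not be read until the stream has finished.
    CHECK(cudaStreamSynchronize(stream));

    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (h_dst[i] != h_src[i]) { ok = false; break; }
    }

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_dst));
    CHECK(cudaFreeHost(h_src));
    return ok;
}

int main() {
    const bool ok = async_roundtrip(1 << 20);
    std::printf("async round trip %s\n", ok ? "succeeded" : "FAILED");
    return ok ? 0 : 1;
}
```

The cudaStreamSynchronize call is the point where the host rejoins the stream. Reading h_dst before it would race with the in-flight copy.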

All lessons in this course

  1. Why Pageable Memory Is Slow
  2. Pinned Memory with cudaMallocHost
  3. cudaMemcpyAsync in a Stream
  4. The Double-Buffering Pipeline