CUDA Academy · Lesson

The Life of a CUDA Program

Allocate, copy, launch, copy back, free.

A Repeating Rhythm

Almost every CUDA program follows the same five-beat dance: allocate, copy in, launch, copy back, free. Learn the rhythm once and most GPU code becomes much easier to read. 🕺
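To make the rhythm concrete, here is a minimal sketch of all five beats in one function. The kernel name add, the helper run_vector_add, the float element type, and the block size of 256 are assumptions chosen for illustration; they are not part of the lesson itself.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: c[i] = a[i] + b[i].
__global__ void add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];  // guard threads past the end of the array
}

// Hypothetical helper: runs the five steps on n elements and returns true
// only if every CUDA call succeeded and every result equals 1 + 2 = 3.
bool run_vector_add(int n) {
    if (n <= 0) return false;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> h_a(n, 1.0f), h_b(n, 2.0f), h_c(n, 0.0f);
    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;

    // 1. Allocate device buffers.
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess
           && cudaMalloc(&d_b, bytes) == cudaSuccess
           && cudaMalloc(&d_c, bytes) == cudaSuccess;

    // 2. Copy the inputs from host to device.
    ok = ok
      && cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
      && cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // 3. Launch the kernel with enough blocks to cover all n elements.
    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        add<<<blocks, threads>>>(d_a, d_b, d_c, n);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // 4. Copy the result back. In the default stream this cudaMemcpy
    //    runs only after the kernel above has finished.
    ok = ok
      && cudaMemcpy(h_c.data(), d_c, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // 5. Free device memory. cudaFree on a null pointer does nothing.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    for (int i = 0; ok && i < n; ++i) ok = (h_c[i] == 3.0f);
    return ok;
}
```

Calling run_vector_add(1 << 20) from main and printing the returned flag is enough to try it out. Host vectors are used here so that only the device side needs manual cleanup.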

Step 1: Allocate

First, reserve device memory for your inputs and outputs with cudaMalloc. The GPU has its own address space, so a kernel cannot read or write ordinary host buffers directly.

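float *d_a, *d_b;          // device pointers (float is assumed here for illustration)
cudaMalloc(&d_a, bytes);   // the size argument is in bytes, not elements
cudaMalloc(&d_b, bytes);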
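cudaMalloc returns a cudaError_t, and real programs usually check it. The following is a minimal sketch of that check, assuming float elements; the helper name alloc_device_floats is hypothetical and not part of the CUDA API.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: allocates room for n floats on the device.
// Returns nullptr and prints the CUDA error string if allocation fails.
float* alloc_device_floats(size_t n) {
    float* d_ptr = nullptr;
    cudaError_t err = cudaMalloc(&d_ptr, n * sizeof(float));
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return nullptr;
    }
    return d_ptr;  // valid only on the device; release with cudaFree
}
```

The returned pointer must not be dereferenced on the host. It is meant to be passed to cudaMemcpy, to a kernel, and finally to cudaFree.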
