CUDA Academy · Lesson

The Vector Add Kernel

One thread adds one pair of elements.

The Big Idea

Vector addition is the perfect first kernel: each output is just C[i] = A[i] + B[i]. No output depends on any other, so all the additions can run at once. 🚀
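To make the independence concrete, here is a minimal sketch of the same computation as an ordinary sequential loop on the CPU. The function name `vector_add_cpu` and the sample values are illustrative assumptions, not part of the lesson. Notice that iteration i touches only index i, which is exactly why the work can be split up.

```cuda
#include <cstdio>
#include <vector>

// Sequential reference version (hypothetical helper name, for illustration).
// One loop iteration handles one element.
void vector_add_cpu(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; ++i) {
        // Iteration i reads a[i], b[i] and writes c[i]; nothing else.
        c[i] = a[i] + b[i];
    }
}

int main() {
    // Small made-up inputs, just to show the shape of the computation.
    std::vector<float> a = {1.0f, 2.0f, 3.0f, 4.0f};
    std::vector<float> b = {10.0f, 20.0f, 30.0f, 40.0f};
    std::vector<float> c(a.size());

    vector_add_cpu(a.data(), b.data(), c.data(), static_cast<int>(a.size()));

    for (float v : c) std::printf("%.1f ", v);  // prints: 11.0 22.0 33.0 44.0
    std::printf("\n");
    return 0;
}
```

Because no iteration reads a value written by another, the iterations could run in any order, or all at the same time.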

One Thread, One Element

The whole trick is simple: assign one thread to each element. Instead of one loop walking the array, thousands of threads each perform a single add in parallel, and each thread uses its own index to pick which element it handles.
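The one-thread-per-element idea can be sketched as a kernel like the one below. This is a minimal illustration, not the course's official code: the names `vector_add` and `vector_add_gpu`, the block size of 256, and the test size of 1000 are all assumptions. The host-side helper is only there so the sketch runs on its own; it skips error checking and assumes a non-empty input.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread computes exactly one output element.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    // Global index of this thread across the whole grid.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The grid is rounded up to whole blocks, so threads past the end do nothing.
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: copy inputs to the GPU, launch, copy the result back.
// Error checking is omitted to keep the sketch short; assumes a.size() == b.size() > 0.
std::vector<float> vector_add_gpu(const std::vector<float>& a,
                                  const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    size_t bytes = n * sizeof(float);
    std::vector<float> c(n);

    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // round up so every element gets a thread
    vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return c;
}

int main() {
    const int n = 1000;  // deliberately not a multiple of 256
    std::vector<float> a(n, 1.0f), b(n, 2.0f);
    std::vector<float> c = vector_add_gpu(a, b);
    std::printf("c[0] = %.1f, c[%d] = %.1f\n", c[0], n - 1, c[n - 1]);
    return 0;
}
```

With 1000 elements and 256 threads per block, the launch uses 4 blocks (1024 threads), so the last 24 threads fail the `i < n` check and simply exit. On a machine with a working CUDA device, both printed values should be 3.0.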

All lessons in this course

  1. The Vector Add Kernel
  2. Wiring Up the Host Side
  3. Verifying the Result on the CPU
  4. Timing Your First Speedup