The Vector Add Kernel
One thread adds one pair of elements.
The Big Idea
Vector addition is the perfect first kernel: each output is just C[i] = A[i] + B[i]. No element depends on any other, so the additions can run in parallel, in any order. 🚀
One Thread, One Element
The whole trick is simple: you assign one thread to one element. Instead of a single loop walking the array, thousands of threads each work out their own index i and do a single add in parallel.
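Here is a minimal sketch of what that can look like in CUDA. The names `vecAdd` and `blocksFor`, the block size of 256, and the array length are illustrative choices, not something fixed by this lesson. It also assumes unified memory (`cudaMallocManaged`) to keep the host side short; a real program might use explicit `cudaMalloc` and `cudaMemcpy` calls instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each thread computes exactly one output element.
__global__ void vecAdd(const float* A, const float* B, float* C, int n) {
    // Global index of this thread across the whole grid.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The grid is rounded up to whole blocks, so surplus threads must do nothing.
    if (i < n) {
        C[i] = A[i] + B[i];
    }
}

// Hypothetical helper: number of blocks needed to cover n elements (ceiling division).
int blocksFor(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main() {
    const int n = 1000000;                 // illustrative size, not a multiple of 256
    const size_t bytes = n * sizeof(float);

    float *A = nullptr, *B = nullptr, *C = nullptr;
    // Assumption: unified memory, accessible from both CPU and GPU.
    if (cudaMallocManaged(&A, bytes) != cudaSuccess ||
        cudaMallocManaged(&B, bytes) != cudaSuccess ||
        cudaMallocManaged(&C, bytes) != cudaSuccess) {
        std::fprintf(stderr, "allocation failed\n");
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        A[i] = 1.0f;
        B[i] = 2.0f;
    }

    const int threadsPerBlock = 256;       // a common choice, assumed here
    const int blocks = blocksFor(n, threadsPerBlock);
    vecAdd<<<blocks, threadsPerBlock>>>(A, B, C, n);

    // Wait for the kernel to finish and surface any launch or runtime error.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Count elements that do not equal 1.0f + 2.0f.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (C[i] != 3.0f) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return mismatches == 0 ? 0 : 1;
}
```

The `if (i < n)` guard matters because the launch always uses whole blocks: whenever n is not a multiple of the block size, the last block contains threads whose index falls past the end of the arrays.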