Choosing Threads per Block
Sensible defaults like 128 and 256.
A Real Decision
When you launch a kernel, you must pick how many threads per block to use. This small choice affects real performance. 🎚️
Multiples of 32
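The GPU runs threads in groups of 32 called warps, so always pick a block size that is a multiple of 32. Any other size leaves the last warp of each block partially filled, and its unused lanes are wasted. For example, a block of 100 threads still occupies 4 warps (128 lanes), so 28 lanes sit idle.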
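The warp arithmetic above can be sketched in a few lines of host-side code. This is a minimal illustration, not an official API: the names `kWarpSize`, `warps_per_block`, `idle_lanes`, and `blocks_for` are hypothetical helpers made up for this lesson, and the warp size of 32 is taken from the text rather than queried from a device.

```cpp
// Host-side arithmetic only; no GPU is needed to compile or use this.
// All names here are illustrative helpers, not part of the CUDA API.
constexpr int kWarpSize = 32;  // warp width, as stated in the lesson

// Warps needed to hold one block. Rounds up, because a partially
// filled warp still occupies a whole warp.
constexpr int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Lanes left idle in the last warp of each block.
constexpr int idle_lanes(int threads_per_block) {
    return warps_per_block(threads_per_block) * kWarpSize - threads_per_block;
}

// Blocks needed to cover n elements with one thread per element
// (ceiling division, so the final block may be only partly used).
constexpr int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}
```

With these helpers, `idle_lanes(100)` is 28 while `idle_lanes(128)` and `idle_lanes(256)` are both 0, which is why 128 and 256 make sensible defaults. To cover 1000 elements at 256 threads per block, `blocks_for(1000, 256)` gives 4 blocks.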