0Pricing
CUDA Academy · Lesson

Choosing Threads per Block

Sensible defaults like 128 and 256.

A Real Decision

When you launch a kernel, you must pick how many threads per block to use. This small choice affects real performance. 🎚️

Multiples of 32

The GPU runs threads in groups of 32 called warps, so always pick a multiple of 32. Odd sizes waste lanes in a partial warp.

All lessons in this course

  1. The Thread Hierarchy
  2. threadIdx, blockIdx, blockDim
  3. Why Blocks Exist
  4. Choosing Threads per Block
← Back to CUDA Academy