CUDA Academy · Lesson

The Classic Index Formula

For a one-dimensional launch, each thread computes its global index as blockIdx.x * blockDim.x + threadIdx.x.

One Thread, One Element

In the simplest CUDA kernels, each thread handles exactly one element of the data. To do that, each thread needs a global index that no other thread in the grid shares.
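As a minimal sketch of the one-thread-one-element pattern, the program below has each thread compute its global index and write a single array slot. The names squareIndex and squaresOnGpu are hypothetical, chosen only for this illustration. The launch is assumed to cover the array exactly (blocks * threadsPerBlock elements), and error checking on the CUDA calls is left out for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread computes its global index and writes exactly one element.
__global__ void squareIndex(int *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique across the whole grid
    out[i] = i * i;
}

// Hypothetical host helper. It launches blocks * threadsPerBlock threads for an
// array of the same length, so every thread has a valid element to write.
std::vector<int> squaresOnGpu(int blocks, int threadsPerBlock)
{
    int n = blocks * threadsPerBlock;
    std::vector<int> host(n);

    int *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));

    squareIndex<<<blocks, threadsPerBlock>>>(dev);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}

int main()
{
    std::vector<int> result = squaresOnGpu(2, 4);  // 2 blocks of 4 threads, 8 elements
    for (int v : result) {
        printf("%d ", v);
    }
    printf("\n");  // prints: 0 1 4 9 16 25 36 49
    return 0;
}
```

With 2 blocks of 4 threads, the eight threads receive the indices 0 through 7, one each, so every slot is written exactly once.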

Why Local IDs Are Not Enough

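Inside a block, threadIdx.x only counts from 0 up to blockDim.x - 1. Every block reuses that same small range, so threadIdx.x alone cannot be your final index: threads in different blocks would land on the same elements. Adding the offset blockIdx.x * blockDim.x shifts each block's range past the threads of all earlier blocks.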
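To make the repetition visible, the sketch below stores each thread's threadIdx.x in the slot selected by its global index, then prints both side by side. The names recordLocalIds and localIdsOnGpu are hypothetical, the launch is assumed to match the array length exactly, and error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread stores its block-local ID at the position given by its global index.
__global__ void recordLocalIds(int *localId)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    localId[i] = threadIdx.x;
}

// Hypothetical host helper: one array slot per launched thread.
std::vector<int> localIdsOnGpu(int blocks, int threadsPerBlock)
{
    int n = blocks * threadsPerBlock;
    std::vector<int> host(n);

    int *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));

    recordLocalIds<<<blocks, threadsPerBlock>>>(dev);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}

int main()
{
    std::vector<int> ids = localIdsOnGpu(2, 4);  // 2 blocks of 4 threads
    for (int i = 0; i < (int)ids.size(); ++i) {
        printf("global %d -> threadIdx.x %d\n", i, ids[i]);
    }
    return 0;
}
```

For this launch the local IDs come back as 0 1 2 3 0 1 2 3, while the global indices run 0 through 7. The second block's thread with threadIdx.x = 1, for example, gets global index 1 * 4 + 1 = 5.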
