CUDA Academy · Lesson

Killing Warp Divergence

Reindexing to keep warps busy.

Warps Run in Lockstep

A warp is a group of 32 threads that execute the same instruction at the same time. When every thread in a warp follows the same control-flow path, all 32 lanes do useful work on each instruction.
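To make the grouping concrete, here is a minimal sketch of how threads in a one-dimensional block map to warps. The kernel name `record_warp_layout` and the host helper `warp_ids_for_block` are hypothetical names chosen for illustration. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread records which warp of its block it belongs to and its lane
// (its position inside that warp). warpSize is a CUDA built-in variable.
// Assumes a 1-D block.
__global__ void record_warp_layout(int *warp_id, int *lane_id) {
    int tid = threadIdx.x;
    warp_id[tid] = tid / warpSize;  // threads 0..31 -> warp 0, 32..63 -> warp 1, ...
    lane_id[tid] = tid % warpSize;  // lane 0..31 within the warp
}

// Hypothetical host helper: launches a single block of `threads` threads
// and returns the warp index recorded by each thread.
std::vector<int> warp_ids_for_block(int threads) {
    int *d_warp = nullptr;
    int *d_lane = nullptr;
    cudaMalloc(&d_warp, threads * sizeof(int));
    cudaMalloc(&d_lane, threads * sizeof(int));

    record_warp_layout<<<1, threads>>>(d_warp, d_lane);

    std::vector<int> warp(threads);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(warp.data(), d_warp, threads * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_warp);
    cudaFree(d_lane);
    return warp;
}
```

Consecutive thread indices land in the same warp, so which threads share a branch decision is determined entirely by how work is assigned to `threadIdx.x`.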

What Divergence Costs

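Divergence occurs when threads in the same warp take different branches. The hardware then executes each branch path one after the other, masking off the lanes that did not take the path being executed. Those lanes sit idle, and the warp spends cycles on every path that any of its threads took.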
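The sketch below contrasts a divergent branch with a reindexed one, using a shared-memory block sum as the example. It is an illustration under stated assumptions, not necessarily the exact code this course uses: the kernel names and the `block_sum` helper are hypothetical, the block size is assumed to be a power of two, the input is assumed to fit in a single block, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Divergent version: the active threads are those whose index is a multiple
// of 2*s, so active and idle lanes are interleaved inside every warp.
__global__ void reduce_divergent(const float *in, float *out) {
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;
    sdata[tid] = in[blockIdx.x * blockDim.x + tid];
    __syncthreads();
    for (unsigned s = 1; s < blockDim.x; s *= 2) {
        if (tid % (2 * s) == 0) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Reindexed version: the same additions are performed, but by the
// lowest-numbered threads, so the active threads are contiguous.
// Whole warps are then either fully active or fully idle until fewer
// than 32 threads remain active.
__global__ void reduce_reindexed(const float *in, float *out) {
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;
    sdata[tid] = in[blockIdx.x * blockDim.x + tid];
    __syncthreads();
    for (unsigned s = 1; s < blockDim.x; s *= 2) {
        unsigned index = 2 * s * tid;  // element this thread is responsible for
        if (index < blockDim.x) sdata[index] += sdata[index + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Hypothetical host helper: sums `data` with a single block.
// Assumes data.size() is a power of two and at most 1024.
float block_sum(const std::vector<float> &data, bool reindexed) {
    unsigned n = static_cast<unsigned>(data.size());
    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, data.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Third launch parameter: bytes of dynamic shared memory for sdata.
    if (reindexed) reduce_reindexed<<<1, n, n * sizeof(float)>>>(d_in, d_out);
    else           reduce_divergent<<<1, n, n * sizeof(float)>>>(d_in, d_out);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

With 256 threads and `s = 1`, the divergent kernel keeps every other lane of all eight warps idle, while the reindexed kernel keeps warps 0 to 3 fully busy and lets warps 4 to 7 skip the branch entirely. Both produce the same sum. The reindexed version accesses shared memory with a stride, which has costs of its own; this sketch addresses divergence only.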
