Instruction-Level Parallelism
Giving each thread more independent work.
More Than One Thing at a Time
Inside a single thread, the GPU can keep several independent instructions in flight at once, where independent means that no instruction needs another's result. This overlap is called instruction-level parallelism, or ILP.
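A minimal sketch of the difference between dependent and independent instructions, assuming a simple per-thread sum. The kernel names sum_chain and sum_ilp4 and the host helper run_sum are hypothetical, chosen only for this illustration. The first kernel forms one chain in which every add waits for the previous one. The second keeps four accumulators that never read each other inside the loop.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One dependent chain: every add needs the result of the previous add.
__global__ void sum_chain(const float* in, float* out, int per_thread) {
    int nthreads = gridDim.x * blockDim.x;
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float acc = 0.0f;
    for (int k = 0; k < per_thread; ++k)
        acc += in[tid + k * nthreads];
    out[tid] = acc;
}

// Four independent chains: a0..a3 never read each other inside the loop,
// so their loads and adds can be in flight at the same time.
__global__ void sum_ilp4(const float* in, float* out, int per_thread) {
    int nthreads = gridDim.x * blockDim.x;
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    int k = 0;
    for (; k + 3 < per_thread; k += 4) {
        a0 += in[tid + (k + 0) * nthreads];
        a1 += in[tid + (k + 1) * nthreads];
        a2 += in[tid + (k + 2) * nthreads];
        a3 += in[tid + (k + 3) * nthreads];
    }
    // Leftover elements when per_thread is not a multiple of 4.
    for (; k < per_thread; ++k)
        a0 += in[tid + k * nthreads];
    // Combine the partial sums once, after the loop.
    out[tid] = (a0 + a1) + (a2 + a3);
}

// Hypothetical host helper: fills the input with 1.0f, runs one kernel,
// and returns the per-thread sums (error checking omitted for brevity).
std::vector<float> run_sum(bool use_ilp, int per_thread) {
    const int threads = 256, blocks = 4, nthreads = threads * blocks;
    std::vector<float> h_in((size_t)nthreads * per_thread, 1.0f), h_out(nthreads);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, h_in.size() * sizeof(float));
    cudaMalloc(&d_out, h_out.size() * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), h_in.size() * sizeof(float), cudaMemcpyHostToDevice);
    if (use_ilp) sum_ilp4<<<blocks, threads>>>(d_in, d_out, per_thread);
    else         sum_chain<<<blocks, threads>>>(d_in, d_out, per_thread);
    cudaMemcpy(h_out.data(), d_out, h_out.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling run_sum(false, 10) and run_sum(true, 10) gives the same sums here, since every input is 1.0f. On general data the two kernels add in a different order, so results can differ in the last bits. Whether the four-accumulator version runs faster depends on the hardware and on what the compiler already does, so it is worth measuring.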
Why ILP Matters
Memory and math operations take many cycles to finish. When a thread has enough independent work, the hardware can issue the next instruction while earlier ones are still completing, so that latency is hidden instead of causing a stall.
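One way to give a thread more independent memory work is to have it handle several elements per loop iteration and issue all of its loads before using any of them. The sketch below assumes a SAXPY-style update (y = a * x + y). The names saxpy_ilp4 and run_saxpy are hypothetical, as are the launch dimensions, and the compiler may already schedule loads this way on its own.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread updates four elements per iteration. The eight loads are
// independent of one another, so they can all be outstanding at once
// instead of each waiting for the previous load to return.
__global__ void saxpy_ilp4(int n, float a, const float* x, float* y) {
    int stride = blockDim.x * gridDim.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    for (; i + 3 * stride < n; i += 4 * stride) {
        // Issue every load first.
        float x0 = x[i];
        float x1 = x[i + stride];
        float x2 = x[i + 2 * stride];
        float x3 = x[i + 3 * stride];
        float y0 = y[i];
        float y1 = y[i + stride];
        float y2 = y[i + 2 * stride];
        float y3 = y[i + 3 * stride];
        // Then do the math and the stores.
        y[i]              = a * x0 + y0;
        y[i + stride]     = a * x1 + y1;
        y[i + 2 * stride] = a * x2 + y2;
        y[i + 3 * stride] = a * x3 + y3;
    }
    // Tail: elements left over when fewer than four remain for this thread.
    for (; i < n; i += stride)
        y[i] = a * x[i] + y[i];
}

// Hypothetical host helper: x[i] = i, y[i] = 1, returns the updated y
// (error checking omitted for brevity).
std::vector<float> run_saxpy(int n, float a) {
    std::vector<float> hx(n), hy(n, 1.0f);
    for (int i = 0; i < n; ++i) hx[i] = (float)i;
    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    saxpy_ilp4<<<4, 128>>>(n, a, dx, dy);
    cudaMemcpy(hy.data(), dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    return hy;
}
```

The four in-flight elements cost extra registers per thread, so the benefit should be weighed against register use rather than assumed.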
All lessons in this course
- Instruction-Level Parallelism
- Loop Unrolling with #pragma unroll
- Vectorized Loads with float4
- Register Pressure and Spills