Capturing Work into a Graph
How to record the work issued to a stream into a CUDA graph.
What a CUDA Graph Is
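A CUDA graph records a whole sequence of operations, such as kernel launches and memory copies, together with the dependencies between them, as one reusable unit of work. You define it once and can then submit it again and again with a single launch call.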
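The description above can be made concrete with a minimal sketch of stream capture. It assumes CUDA 11.4 or newer (for `cudaGraphInstantiateWithFlags`) and a single GPU. The kernel `add_scalar`, the helper `capture_and_replay`, and the `CHECK` macro are hypothetical names invented for this illustration; they are not part of the CUDA API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed CUDA call and return -1.0f.
// (A sketch only: resources are not released on the error path.)
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "CUDA error: %s\n",                       \
                         cudaGetErrorString(err_));                        \
            return -1.0f;                                                  \
        }                                                                  \
    } while (0)

// Hypothetical kernel for illustration: adds `value` to every element.
__global__ void add_scalar(float* data, int n, float value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += value;
}

// Captures copy-in, two kernels, and copy-out into one graph, replays the
// graph `replays` times, and returns the first element of the result.
float capture_and_replay(int n, int replays) {
    const size_t bytes = n * sizeof(float);
    float *h_in = nullptr, *h_out = nullptr, *d_buf = nullptr;

    // Allocate before capture; pinned host memory keeps the copies async.
    CHECK(cudaMallocHost(&h_in, bytes));
    CHECK(cudaMallocHost(&h_out, bytes));
    CHECK(cudaMalloc(&d_buf, bytes));
    for (int i = 0; i < n; ++i) { h_in[i] = 0.0f; h_out[i] = 0.0f; }

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));  // capture needs a non-default stream

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Between Begin and End, work issued to `stream` is recorded, not run.
    cudaGraph_t graph;
    CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    CHECK(cudaMemcpyAsync(d_buf, h_in, bytes, cudaMemcpyHostToDevice, stream));
    add_scalar<<<blocks, threads, 0, stream>>>(d_buf, n, 1.0f);
    add_scalar<<<blocks, threads, 0, stream>>>(d_buf, n, 2.0f);
    CHECK(cudaMemcpyAsync(h_out, d_buf, bytes, cudaMemcpyDeviceToHost, stream));
    CHECK(cudaStreamEndCapture(stream, &graph));

    // Instantiate once, then submit the whole sequence with one call each time.
    cudaGraphExec_t exec;
    CHECK(cudaGraphInstantiateWithFlags(&exec, graph, 0));
    for (int r = 0; r < replays; ++r) CHECK(cudaGraphLaunch(exec, stream));
    CHECK(cudaStreamSynchronize(stream));

    float result = h_out[0];

    CHECK(cudaGraphExecDestroy(exec));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_in));
    CHECK(cudaFreeHost(h_out));
    return result;
}

int main() {
    float first = capture_and_replay(1024, 3);
    if (first < 0.0f) return 1;
    std::printf("first element after replay: %.1f\n", first);
    return 0;
}
```

Because the captured sequence starts by copying the zero-filled input buffer to the device, every replay recomputes the same result (0 + 1 + 2), so the value read back does not depend on how many times the graph is launched. Memory is allocated before capture begins, since the sketch keeps allocation calls outside the recorded region.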
Launches Add Up
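Each kernel launch costs the CPU a small, roughly fixed amount of time, typically on the order of microseconds. That is negligible for a single launch, but across thousands of launches per iteration it adds up to a measurable cost, especially when the kernels themselves are short.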
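One rough way to see this cost is to time a loop of launches of a kernel that does nothing. The sketch below is an assumption-laden illustration, not a rigorous benchmark: `empty_kernel` and `average_launch_microseconds` are hypothetical names, and the figure it reports is an average end-to-end cost per launch (CPU submission plus GPU scheduling), which varies by GPU, driver, and system.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel that does no work, so the time measured below is
// dominated by launch overhead rather than computation.
__global__ void empty_kernel() {}

// Returns the average wall-clock time per launch in microseconds,
// or -1.0 if a CUDA call fails.
double average_launch_microseconds(int launches) {
    // Warm-up launch so one-time initialization is not included in the timing.
    empty_kernel<<<1, 1>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < launches; ++i) {
        empty_kernel<<<1, 1>>>();
    }
    // Wait for all launches to finish before stopping the clock.
    if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / launches;
}

int main() {
    const int launches = 10000;
    double us = average_launch_microseconds(launches);
    if (us < 0.0) {
        std::fprintf(stderr, "CUDA error during timing\n");
        return 1;
    }
    std::printf("average cost per launch: %.2f us over %d launches\n",
                us, launches);
    return 0;
}
```

Multiplying the reported per-launch figure by the number of launches in one iteration of a real workload gives a rough estimate of how much time goes to launching rather than computing.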