Kernel Metrics in Nsight Compute
Throughput, stalls, and roofline data.
Zoom Into One Kernel
Nsight Compute is the microscope for a single kernel. It collects detailed hardware metrics so you can see why that kernel runs slowly. 🔬
How to Launch It
You profile a kernel by running your application under ncu on the command line. ncu replays each kernel several times, because the hardware cannot collect every counter in a single pass.
ncu -o report ./my_cuda_app
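Profiling every kernel in an application can be slow, so it often helps to narrow the run to a single kernel. The sketch below is one possible way to do that, not the only workflow. The names my_kernel and ./my_cuda_app are placeholders, and the guard around the commands is only there so the script is safe to run on a machine without ncu installed. It assumes that -o report produces a file named report.ncu-rep.

```shell
#!/bin/sh
# Placeholder names: replace with your own binary and kernel.
APP=./my_cuda_app
KERNEL=my_kernel
REPORT=report

if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Collect the full metric set for the first launch of one kernel.
    # -f overwrites an existing report file with the same name.
    ncu --set full --kernel-name "$KERNEL" --launch-count 1 -f -o "$REPORT" "$APP"

    # Print the saved report as text in the terminal.
    ncu --import "$REPORT.ncu-rep" --page details
else
    echo "ncu or $APP not available; skipping profile"
fi
```

Limiting the run with --kernel-name and --launch-count keeps the replay overhead small. The saved report can also be opened in the Nsight Compute GUI.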