CUDA Academy · Lesson

cuBLAS GEMM Done Right

Handles, column-major storage, and leading dimensions.

Stand on NVIDIA's Shoulders

Writing a fast matmul by hand is hard. cuBLAS ships a battle-tested, heavily tuned GEMM that typically runs close to your GPU's peak throughput. 🚀

GEMM stands for general matrix-matrix multiply. It computes C = alpha * (A * B) + beta * C, where A, B, and C are matrices and alpha and beta are scalars. It is the workhorse behind graphics and deep learning.

C = alpha * (A * B) + beta * C
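
To make the formula concrete, here is a minimal CPU sketch of what GEMM computes. It is not cuBLAS code and is far slower than the library routine. The name gemm_reference is hypothetical, the argument order only loosely mirrors BLAS-style sgemm (no handle, no transpose flags, alpha and beta passed by value), and it assumes column-major storage, where element (i, j) of a matrix with leading dimension ld sits at index i + j * ld.

```cpp
#include <cassert>

// Hypothetical CPU reference for C = alpha * (A * B) + beta * C.
// A is m x k, B is k x n, C is m x n, all stored column-major.
// lda, ldb, ldc are leading dimensions: the stride, in elements,
// between the starts of two consecutive columns.
void gemm_reference(int m, int n, int k,
                    float alpha, const float* A, int lda,
                    const float* B, int ldb,
                    float beta, float* C, int ldc) {
    // A leading dimension must be at least the matrix's row count.
    assert(lda >= m && ldb >= k && ldc >= m);

    for (int j = 0; j < n; ++j) {          // column of C
        for (int i = 0; i < m; ++i) {      // row of C
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {  // dot product of row i of A and column j of B
                acc += A[i + p * lda] * B[p + j * ldb];
            }
            // When beta is zero, the old contents of C are ignored
            // rather than multiplied, so C may start uninitialized.
            const float prev = (beta == 0.0f) ? 0.0f : beta * C[i + j * ldc];
            C[i + j * ldc] = alpha * acc + prev;
        }
    }
}
```

A reference like this is mainly useful for checking GPU results on small matrices. Setting alpha = 1 and beta = 0 gives a plain product C = A * B, while beta = 1 accumulates the product into the existing C.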