What Tensor Cores Compute
Fused matrix multiply-accumulate units.
Meet the Tensor Core
A Tensor Core is a special hardware unit on modern NVIDIA GPUs built to do one job blazingly fast: small matrix math. 🚀
One Operation: MMA
Tensor Cores compute a matrix multiply-accumulate, written D = A times B plus C. They do the whole multiply-and-add in one shot.
All lessons in this course
- What Tensor Cores Compute
- Mixed Precision: FP16, BF16, TF32
- The WMMA Fragment API
- Numerical Stability Tradeoffs