CUDA Academy · Lesson

What Tensor Cores Compute

Fused matrix multiply-accumulate units.

Meet the Tensor Core

A Tensor Core is a specialized hardware unit on modern NVIDIA GPUs (Volta and later) built to do one job very fast: multiplying small matrices. 🚀

One Operation: MMA

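Tensor Cores compute a matrix multiply-accumulate (MMA), written D = A × B + C, where A, B, C, and D are small matrix tiles. The multiply and the add are fused into a single hardware operation instead of being issued as separate steps.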
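
To make the formula concrete, here is a minimal CPU sketch of what one MMA computes. It is a plain C++ reference, not Tensor Core code: the 4×4 tile size, the `Tile` alias, and the `mma_reference` name are assumptions made for illustration, and real tile shapes depend on the GPU architecture and data type.

```cpp
#include <array>
#include <cstddef>

// Tile size chosen for illustration only (an assumption, not a hardware fact).
constexpr std::size_t N = 4;
using Tile = std::array<std::array<float, N>, N>;

// CPU reference for one matrix multiply-accumulate: D = A * B + C.
// Hypothetical helper name; it only mirrors the math a Tensor Core performs.
Tile mma_reference(const Tile& A, const Tile& B, const Tile& C) {
    Tile D{};
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = C[i][j];              // start from the accumulator C
            for (std::size_t k = 0; k < N; ++k) {
                acc += A[i][k] * B[k][j];     // add products of row i of A and column j of B
            }
            D[i][j] = acc;                    // one output element of D
        }
    }
    return D;
}
```

For example, if A is twice the identity matrix and every entry of C is 1, each entry of D comes out as 2 * B[i][j] + 1. The three nested loops here run one multiply and one add at a time; the point of a Tensor Core is that the hardware performs the whole tile operation as a single fused step.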

All lessons in this course

  1. What Tensor Cores Compute
  2. Mixed Precision: FP16, BF16, TF32
  3. The WMMA Fragment API
  4. Numerical Stability Tradeoffs