Deep Learning Academy · Lesson

Reading and Zeroing .grad

Why gradients accumulate and must be reset.

The .grad Attribute

After backward() runs, every leaf tensor created with requires_grad=True stores its gradient in .grad. Reading it tells you how the loss responds to small changes in that tensor.

print(weight.grad)

Before any backward call, .grad is None, not zero. PyTorch only allocates it once the first set of gradients actually arrives.
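The None-then-populated behavior can be checked with a minimal sketch. The tensor values and the sum-of-squares loss are assumptions chosen for illustration; only the name weight comes from the snippet above.

```python
import torch

# Hypothetical parameter: a small leaf tensor that tracks gradients.
weight = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# No backward call has happened yet, so nothing is allocated.
grad_before = weight.grad
print(grad_before)  # None

# Toy loss (an assumption for this sketch): sum of squares,
# so d(loss)/d(weight) = 2 * weight.
loss = (weight ** 2).sum()
loss.backward()

# .grad now holds a tensor with the same shape as weight.
grad_after = weight.grad
print(grad_after)  # tensor([2., 4., 6.])
```

Because .grad starts as None, code that inspects gradients before the first backward call should test for None rather than compare against zero.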