Reading and Zeroing .grad
Why gradients accumulate and must be reset.
The .grad Attribute
After backward() runs, every leaf tensor created with requires_grad=True stores its gradient in .grad. Reading it tells you how the loss responds to a small change in each element of that tensor.
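As a minimal sketch of what reading a gradient looks like, the example here builds a tiny loss by hand. The tensor names (weight, x) and the loss formula are assumptions chosen for illustration, not part of the lesson's own code.

```python
import torch

# Hypothetical leaf tensor; requires_grad=True asks autograd to track it.
weight = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
x = torch.tensor([4.0, 5.0, 6.0])  # plain input, no gradient tracked

loss = (weight * x).sum()  # loss = 1*4 + 2*5 + 3*6
loss.backward()            # fills weight.grad with d(loss)/d(weight)

# For this loss, d(loss)/d(weight[i]) equals x[i].
print(weight.grad)  # tensor([4., 5., 6.])
```

Each entry of the gradient matches the corresponding entry of x, because increasing weight[i] by a small amount raises the loss by x[i] times that amount.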
print(weight.grad)
Gradients Start as None
Before any backward call, .grad is None, not zero. PyTorch only allocates it once the first set of gradients actually arrives.
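The difference between None and zero can be checked directly. This is a small sketch under assumed names (weight, grad_before) and an arbitrary toy loss, not a definitive recipe.

```python
import torch

# Hypothetical leaf tensor that autograd will track.
weight = torch.ones(3, requires_grad=True)

grad_before = weight.grad  # nothing has been computed yet
print(grad_before)         # None

loss = (2.0 * weight).sum()
loss.backward()            # first backward call allocates weight.grad

print(weight.grad)         # tensor([2., 2., 2.])
```

Code that inspects gradients before the first backward call should therefore test for None rather than compare against zero.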