Forward Caches, Backward Reuses
Why activations are stored during the forward pass.
Two Passes, One Goal
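Training on each batch runs two passes: a forward pass that computes the prediction and the loss, then a backward pass that computes the gradient of the loss with respect to each parameter. They work as a pair: the backward pass reuses values the forward pass already computed.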
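To make the pairing concrete, here is a minimal sketch for a one-weight model. The model (`pred = w * x`), the squared-error loss, and the names `forward`, `backward`, and `cache` are assumptions chosen for illustration, not something this lesson prescribes.

```python
def forward(w, x, y):
    """Forward pass: predict, then compute a squared-error loss."""
    pred = w * x
    err = pred - y
    loss = err ** 2
    # Keep the values the backward pass will need (illustrative cache layout).
    cache = {"x": x, "err": err}
    return loss, cache


def backward(cache):
    """Backward pass: d(loss)/dw = 2 * err * x, read straight from the cache."""
    return 2.0 * cache["err"] * cache["x"]


loss, cache = forward(w=3.0, x=2.0, y=5.0)
grad_w = backward(cache)
print(loss, grad_w)  # prints: 1.0 4.0
```

Note that `backward` never recomputes `pred`: it only reads what `forward` saved.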
The Forward Pass Computes Values
In the forward pass, each layer transforms its input into an output and passes it to the next layer. At the end you have a prediction and a single scalar loss.
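The layer-by-layer flow might look like the sketch below. It assumes scalar layers of the form `act(w * a + b)`, a ReLU on the hidden layer, and a squared-error loss; `forward_layers` and the specific weights are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def forward_layers(x, layers, target):
    """Run x through each (w, b, act) layer, recording every activation."""
    activations = [x]          # the input counts as the first activation
    a = x
    for w, b, act in layers:
        a = act(w * a + b)     # this layer's output is the next layer's input
        activations.append(a)  # stored during the forward pass
    loss = (a - target) ** 2   # a single loss number
    return a, loss, activations


layers = [(2.0, 1.0, relu), (0.5, -1.0, identity)]
pred, loss, activations = forward_layers(1.0, layers, target=1.0)
print(pred, loss, activations)  # prints: 0.5 0.25 [1.0, 3.0, 0.5]
```

The `activations` list is the stored record of the forward pass: one entry for the input and one per layer output.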