Deep Learning Academy · Lesson

Forward Caches, Backward Reuses

Why activations are stored during the forward pass.

Two Passes, One Goal

Training on each batch runs two passes: a forward pass that produces a prediction and computes the loss, then a backward pass that computes the gradient of that loss with respect to each parameter. The two work as a pair.

In the forward pass, each layer turns its input into an output and sends it on to the next layer. By the end you have a prediction and a single loss number.
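
To make the pairing concrete, here is a minimal sketch in plain Python. It assumes a hypothetical two-layer scalar network with a ReLU and a squared-error loss, none of which comes from the lesson itself. The names forward, backward, and cache are illustrative only. The forward pass saves each intermediate value, and the backward pass reads those saved values instead of recomputing them.

```python
# Hypothetical two-layer scalar network (an assumption for illustration):
#   h = relu(w1 * x),  y = w2 * h,  loss = (y - target) ** 2

def forward(x, w1, w2, target):
    """Run the forward pass and cache what the backward pass will need."""
    z = w1 * x                 # layer 1 pre-activation
    h = max(0.0, z)            # layer 1 output (ReLU), sent onward to layer 2
    y = w2 * h                 # layer 2 output: the prediction
    loss = (y - target) ** 2   # a single loss number
    cache = {"x": x, "z": z, "h": h, "y": y}
    return y, loss, cache

def backward(cache, w2, target):
    """Compute gradients of the loss, reusing the cached forward values."""
    dy = 2.0 * (cache["y"] - target)       # d(loss)/dy
    dw2 = dy * cache["h"]                  # needs the cached layer 1 output
    dh = dy * w2                           # gradient flowing back into layer 1
    dz = dh if cache["z"] > 0 else 0.0     # ReLU gradient needs the cached z
    dw1 = dz * cache["x"]                  # needs the cached input
    return dw1, dw2

y, loss, cache = forward(x=2.0, w1=0.5, w2=3.0, target=1.0)
dw1, dw2 = backward(cache, w2=3.0, target=1.0)
print(y, loss)    # 3.0 4.0
print(dw1, dw2)   # 24.0 4.0
```

Every gradient in backward depends on at least one value that forward produced, which is why those values are kept around rather than thrown away after the prediction is made.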