Deep Learning Academy · Lesson

The Chain Rule, Layer by Layer

Compose derivatives through the network.

A Net Is Nested Functions

A neural network is just functions wrapped inside functions: each layer takes the previous layer's output as its input. Backprop needs the derivative of that whole stack, from the final output all the way back to the first input.
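As a rough illustration of the nesting, here is a minimal Python sketch. The two layers (`layer1`, `layer2`) and their formulas are hypothetical, chosen only to keep the example tiny; a real network would use weight matrices and many more units.

```python
import math


def layer1(x):
    # Hypothetical first layer: a scale-and-shift with made-up constants.
    return 2.0 * x + 1.0


def layer2(u):
    # Hypothetical second layer: a tanh nonlinearity.
    return math.tanh(u)


def net(x):
    # The "network" is layer2 wrapped around layer1.
    return layer2(layer1(x))
```

Calling `net(0.0)` first computes `layer1(0.0)` and then passes that result into `layer2`, which is exactly the inside-out evaluation order the paragraph describes.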

The Chain Rule in One Line

The chain rule says the derivative of nested functions is the product of the derivatives of each piece, with each piece's derivative evaluated at the input that piece actually receives. That single fact is the engine behind all of backprop. ⛓️

dy/dx = dy/du * du/dx    (where y = f(u) and u = g(x))
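To see the formula at work, here is a small sketch in Python. It assumes the illustrative choice y = sin(u) with u = x^2 (these functions are not from the lesson), computes dy/dx as the product dy/du * du/dx, and compares the result against a numerical finite-difference estimate.

```python
import math


def g(x):
    # Inner function (assumed for illustration): u = x^2
    return x * x


def dg(x):
    # du/dx = 2x
    return 2.0 * x


def f(u):
    # Outer function (assumed for illustration): y = sin(u)
    return math.sin(u)


def df(u):
    # dy/du = cos(u)
    return math.cos(u)


def chain_rule_derivative(x):
    # dy/dx = dy/du * du/dx, with dy/du evaluated at u = g(x)
    u = g(x)
    return df(u) * dg(x)


def finite_difference(x, h=1e-6):
    # Central-difference estimate of dy/dx, used only as a sanity check
    return (f(g(x + h)) - f(g(x - h))) / (2.0 * h)
```

For any x, `chain_rule_derivative(x)` and `finite_difference(x)` should agree to several decimal places. Note that `df` is evaluated at `u = g(x)`, not at `x` itself, which is the detail that is easiest to get wrong when applying the rule by hand.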
