Cut GPU Memory Usage
Checkpointing and smarter tensor handling.
The Dreaded OOM
When a training run needs more GPU memory than the device has, it crashes with an out-of-memory (OOM) error. The good news is that a few simple tactics can free up space quickly.
Where Memory Goes
During training, your GPU holds the model weights, the gradients, the optimizer state, and the activations saved for the backward pass. Activations are often the largest of the four, because they grow with batch size.
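To make the four categories concrete, here is a rough back-of-the-envelope estimator. It is a sketch under stated assumptions, not a measurement: the function name, the 4-byte (fp32) element size, the two optimizer values per parameter (as in Adam-style optimizers), and the example numbers are all hypothetical. It also ignores framework overhead and memory fragmentation.

```python
def estimate_training_memory(num_params, activation_elems_per_sample, batch_size,
                             bytes_per_elem=4, optimizer_states_per_param=2):
    """Estimate training memory in bytes, split by category.

    Assumptions (hypothetical, adjust for your setup):
    - every value is stored in bytes_per_elem bytes (4 = fp32)
    - the optimizer keeps optimizer_states_per_param extra values per parameter
    - activation_elems_per_sample is the number of activation values
      saved for the backward pass for ONE input sample
    """
    parts = {
        "weights": num_params * bytes_per_elem,
        "gradients": num_params * bytes_per_elem,
        "optimizer_state": num_params * optimizer_states_per_param * bytes_per_elem,
        # Activations are the only term that scales with batch size.
        "activations": activation_elems_per_sample * batch_size * bytes_per_elem,
    }
    parts["total"] = sum(parts.values())
    return parts


# Made-up example: 100M parameters, 50M saved activation values per sample, batch of 32.
estimate = estimate_training_memory(
    num_params=100_000_000,
    activation_elems_per_sample=50_000_000,
    batch_size=32,
)
for name, num_bytes in estimate.items():
    print(f"{name:>16}: {num_bytes / 1e9:.2f} GB")
```

In this made-up configuration the activations account for most of the total, and they are the only term that shrinks when you lower the batch size. The other three terms depend only on the parameter count.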