Profile the Bottleneck
Find where time and memory go.
Why Profile First
Before optimizing, find out where time actually goes. Guessing wastes effort, while a quick profile shows you the real slow spots.
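As a minimal sketch of what a quick profile can look like, the snippet below uses Python's standard-library cProfile and pstats modules. The functions slow_step, fast_step, train_once, and profile_report are hypothetical stand-ins invented for illustration; substitute your own workload.

```python
import cProfile
import io
import pstats


def slow_step():
    # Hypothetical hot spot: a deliberately heavy arithmetic loop.
    return sum(i * i for i in range(200_000))


def fast_step():
    # Hypothetical cheap step, included for contrast.
    return sum(range(1_000))


def train_once():
    # Stand-in for one iteration of the workload being profiled.
    slow_step()
    fast_step()


def profile_report(func, top=5):
    """Run func under cProfile and return the report text, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(train_once)
print(report)
```

In the printed report, the rows with the largest cumulative time are the ones worth looking at first; here that should be slow_step rather than fast_step.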
Two Common Bottlenecks
Training usually stalls in one of two places: the GPU compute doing the math, or the data pipeline feeding it. Knowing which one is the bottleneck tells you where to spend your optimization effort.
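One rough way to tell the two apart is to time how long each iteration waits for the next batch versus how long it spends in the training step. The sketch below uses only the standard library, with time.sleep standing in for real work; fake_batches, fake_train_step, and time_loop are hypothetical names, and the sleep durations are made-up values chosen so the data pipeline is the slower side.

```python
import time


def fake_batches(n, load_seconds):
    # Hypothetical data pipeline: each batch takes load_seconds to produce.
    for i in range(n):
        time.sleep(load_seconds)
        yield i


def fake_train_step(batch, compute_seconds):
    # Hypothetical compute step standing in for the forward/backward pass.
    time.sleep(compute_seconds)


def time_loop(batches, step):
    """Return (seconds spent waiting for data, seconds spent in compute)."""
    data_time = 0.0
    compute_time = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)
        # On a real GPU, kernels launch asynchronously, so synchronize
        # (e.g. torch.cuda.synchronize() in PyTorch) before reading the clock.
        t2 = time.perf_counter()
        data_time += t1 - t0
        compute_time += t2 - t1
    return data_time, compute_time


data_time, compute_time = time_loop(
    fake_batches(5, load_seconds=0.03),
    lambda b: fake_train_step(b, compute_seconds=0.002),
)
bottleneck = "data pipeline" if data_time > compute_time else "GPU compute"
print(f"data: {data_time:.3f}s  compute: {compute_time:.3f}s  -> {bottleneck}")
```

If most of the time lands in the data column, faster compute will not help until the input side is fixed, and vice versa.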