Deep Learning Academy · Lesson

Discriminative Layer-Wise Rates

Train deep and shallow layers differently.

One Rate for All?

So far every layer shared a single learning rate. But early and late layers learn very different things. Maybe they deserve different rates. 🎚️

Early Layers Are General

The early layers of a pretrained net detect edges, colors, and textures. These features transfer to almost any image task, so fine-tuning should change them only slightly.
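
One way to turn this idea into numbers is to give the last layer the largest rate and shrink it step by step toward the input. The sketch below is only an illustration: the helper name `layerwise_rates`, the top rate of `1e-3`, and the decay factor of `0.5` are assumptions chosen for the example, not values from this lesson.

```python
def layerwise_rates(num_layers, top_lr=1e-3, decay=0.5):
    """Return one learning rate per layer, earliest layer first.

    The last layer gets `top_lr`; each layer closer to the input gets
    the rate of the layer after it multiplied by `decay`.
    Both defaults are illustrative assumptions, not recommended values.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if not 0 < decay <= 1:
        raise ValueError("decay must be in the interval (0, 1]")
    # Layer i sits (num_layers - 1 - i) steps below the last layer.
    return [top_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Example: a hypothetical 4-layer network.
rates = layerwise_rates(4, top_lr=1e-3, decay=0.5)
for depth, lr in enumerate(rates):
    print(f"layer {depth}: lr = {lr:.6f}")
```

Early layers end up with the smallest rates, so their general features move the least. In PyTorch, rates like these can be handed to an optimizer as per-parameter groups, one dict with `"params"` and `"lr"` keys per layer.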
