Deep Learning Academy · Lesson

ReLU and Its Leaky & GELU Cousins

The default activation and modern variants.

Meet ReLU

ReLU (rectified linear unit) is the most widely used default activation: it passes positive values through unchanged and sets every negative value to zero. Simple and fast. ⚡

import torch
import torch.nn.functional as F
x = torch.tensor([-1.0, 0.0, 2.0])   # example input
y = F.relu(x)   # max(0, x), elementwise

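ReLU is cheap to compute, and its gradient is exactly 1 for positive inputs (and 0 for negative ones). Gradients passing through active units are not shrunk from layer to layer, which keeps signals flowing and helps deep nets train quickly.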
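
To see that gradient for yourself, here is a minimal sketch that asks PyTorch's autograd for the derivative of ReLU at a few points. The input values are made up for illustration, and 0 is left out on purpose because the gradient exactly at zero is a matter of convention.

```python
import torch
import torch.nn.functional as F

# Illustrative inputs (an assumption for this sketch): two negatives, two positives.
x = torch.tensor([-2.0, -0.5, 1.0, 3.0], requires_grad=True)

y = F.relu(x)        # forward pass: max(0, x), elementwise
y.sum().backward()   # backward pass: fills x.grad with dy/dx for each element

outputs = y.tolist()     # negatives are clamped to zero, positives are unchanged
grads = x.grad.tolist()  # 0 where the input was negative, 1 where it was positive

print(outputs)
print(grads)
```

The positive inputs come back with a gradient of 1, so whatever gradient arrives from later layers passes through them unchanged. The negative inputs get a gradient of 0 and pass nothing back.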