Evaluation and Inference: From Logits to Predicted Labels
Learners will compute softmax over logits, decode predicted class indices to labels, and evaluate the fine-tuned model with accuracy and F1 on a held-out test set.
What Are Logits?
A classification model's final linear layer outputs logits: raw, unnormalised scores, one per class. For a 2-class problem (negative/positive), a logit vector might be [-1.2, 3.5], meaning the model strongly favours class 1 (positive). Logits can be any real number; they are not probabilities. An additional step converts them into interpretable probabilities: softmax over the class logits, whose outputs sum to 1, or sigmoid when the model emits a single logit for a binary decision.
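For two classes, softmax and sigmoid are closely related: the softmax probability of the positive class equals the sigmoid of the difference between the two logits. The short check below is a sketch in plain Python using the example logits [-1.2, 3.5]; the variable names are illustrative only.

```python
import math

# Example logits for the negative (z0) and positive (z1) class
z0, z1 = -1.2, 3.5

# Two-class softmax probability of the positive class
p_softmax = math.exp(z1) / (math.exp(z0) + math.exp(z1))

# Sigmoid applied to the logit difference z1 - z0
p_sigmoid = 1.0 / (1.0 + math.exp(-(z1 - z0)))

# The two values agree up to floating-point rounding
print(p_softmax, p_sigmoid)
```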
import torch
import torch.nn.functional as F
# Example logits for 2-class problem (batch of 3)
logits = torch.tensor([[-1.2, 3.5], [2.1, -0.3], [0.4, 0.6]])
print('Logits:\n', logits)
# Convert to probabilities with softmax
probs = F.softmax(logits, dim=1)
print('Probabilities:\n', probs)
# Each row sums to 1
print('Row sums:', probs.sum(dim=1))
Softmax: Converting Logits to Probabilities
The softmax function converts a vector of real-valued logits into a probability distribution. For class i: softmax(z_i) = exp(z_i) / sum_j exp(z_j), where the sum runs over all classes j. The exponential amplifies differences: a logit difference of 2 makes one class exp(2) ≈ 7.4 times as probable as the other. Softmax is used for multi-class classification where exactly one class is correct.
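To make the formula concrete, here is a minimal from-scratch sketch in plain Python. The helper name `softmax` and the max-subtraction step are additions for illustration: subtracting the largest logit before exponentiating leaves the result unchanged mathematically but avoids overflow for large logits. The example also checks the ratio claim for a logit difference of 2.

```python
import math

def softmax(logits):
    """Softmax over a list of floats (illustrative helper, not a library API)."""
    shift = max(logits)                        # shifting does not change the result
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 3.0])    # two logits that differ by 2
ratio = probs[1] / probs[0]    # equals exp(3.0 - 1.0) = exp(2)
print('Probabilities:', probs)
print('Ratio:', ratio, 'exp(2):', math.exp(2))
```

In practice, F.softmax from PyTorch does the same job on tensors; the hand-written version is only meant to show what the formula computes.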
import torch
import torch.nn.functional as F
logits = torch.tensor([1.0, 3.0, 0.5]) # 3 classes
probs = F.softmax(logits, dim=0)
print('Class probabilities:', probs.numpy().round(4))
# [0.1112 0.8214 0.0674] -- class 1 is most likely
print('Sum:', probs.sum().item()) # 1.0, up to float32 rounding
# argmax gives the predicted class index
pred_class = torch.argmax(probs).item()
print('Predicted class:', pred_class) # 1
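The remaining steps named in this lesson's objective, decoding class indices to labels and scoring predictions with accuracy and F1, can be sketched as follows. Everything in this example is hypothetical toy data: the `id2label` mapping, the fourth logit row, and the true labels are made up for illustration. With a real fine-tuned model, the logits would come from a forward pass over the held-out test set, and the label map would typically come from the model's configuration.

```python
# Hypothetical label map and toy data, for illustration only
id2label = {0: 'negative', 1: 'positive'}
batch_logits = [[-1.2, 3.5], [2.1, -0.3], [0.4, 0.6], [1.5, 0.2]]
true_ids = [1, 0, 0, 0]

# argmax over logits gives the same index as argmax over softmax probabilities,
# because softmax preserves the ordering of the scores
pred_ids = [max(range(len(row)), key=row.__getitem__) for row in batch_logits]
pred_labels = [id2label[i] for i in pred_ids]
print('Predicted labels:', pred_labels)

# Accuracy: fraction of predictions that match the true class
accuracy = sum(p == t for p, t in zip(pred_ids, true_ids)) / len(true_ids)

# F1 for the positive class (index 1), from true/false positives and false negatives
tp = sum(p == 1 and t == 1 for p, t in zip(pred_ids, true_ids))
fp = sum(p == 1 and t == 0 for p, t in zip(pred_ids, true_ids))
fn = sum(p == 0 and t == 1 for p, t in zip(pred_ids, true_ids))
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print('Accuracy:', accuracy)
print('F1 (positive class):', round(f1, 4))
```

The metrics are written out by hand here to show what they measure. Library implementations such as accuracy_score and f1_score in scikit-learn are a common alternative for real evaluation runs.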