Classification Metrics: Accuracy, Precision, Recall, F1
Learners will compute all four metrics on the same predictions and understand which to prioritise depending on the cost of false positives vs false negatives.
Why Accuracy Is Often Not Enough
Accuracy, the fraction of correct predictions, is the most intuitive metric, but it can be misleading when classes are imbalanced. Consider a medical test for a disease that affects 1% of patients. A model that always predicts 'healthy' achieves 99% accuracy while being completely useless: it misses every actual disease case. In fraud detection, cancer screening, or rare-event prediction, accuracy hides the model's failure to detect the rare but critical positive class. This is why we need metrics that focus on how the positive class is handled: precision (the share of predicted positives that are truly positive), recall (the share of actual positives that the model finds), and the F1-score (the harmonic mean of precision and recall).
# Simulated cohort of 1000 patients: 990 healthy (label 0), 10 sick (label 1)
y_true = [0]*990 + [1]*10
# Baseline "model" that always predicts healthy, whatever the input
y_pred_dummy = [0]*1000
correct = sum(p == t for p, t in zip(y_pred_dummy, y_true))
accuracy = correct / len(y_true)
print(f'Dummy accuracy: {accuracy:.1%}') # 99.0% -- misleadingly high!
print('But 0 sick patients correctly identified!')
print('This is why we need precision and recall.')
The Confusion Matrix: Ground Truth
The confusion matrix is the foundation of all classification metrics. For binary classification, it is a 2x2 table: True Positives (TP) — correctly predicted positive; True Negatives (TN) — correctly predicted negative; False Positives (FP) — predicted positive but actually negative (Type I error); False Negatives (FN) — predicted negative but actually positive (Type II error). Every classification metric is derived from some combination of these four numbers. Inspecting the raw confusion matrix reveals which type of error the model makes most.
from sklearn.metrics import confusion_matrix
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 0]
cm = confusion_matrix(y_true, y_pred)
print('Confusion Matrix:')
print(cm)
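# With labels sorted as [0, 1], rows are actual classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
# ravel() flattens the matrix row by row, so the order is TN, FP, FN, TP
TN, FP, FN, TP = cm.ravel()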
print(f'TP={TP}, TN={TN}, FP={FP}, FN={FN}')
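The four confusion-matrix counts are enough to compute accuracy, precision, recall, and F1. The following is a minimal sketch in plain Python rather than a definitive implementation: the helper name `metrics_from_counts` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made for this example. The counts are the ones obtained from the two examples in this lesson (TP=3, TN=4, FP=1, FN=2 for the 10-sample predictions, and TP=0, TN=990, FP=0, FN=10 for the always-healthy baseline).

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Hypothetical helper: derive the four metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Assumption for this sketch: report 0.0 when a denominator is zero,
    # e.g. precision for a model that never predicts the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts from the 10-sample confusion matrix example
example = metrics_from_counts(tp=3, tn=4, fp=1, fn=2)
# Counts for the baseline that always predicts 'healthy'
dummy = metrics_from_counts(tp=0, tn=990, fp=0, fn=10)

for name, result in [("example", example), ("dummy", dummy)]:
    print(name, {k: round(v, 3) for k, v in result.items()})
```

On the same predictions the four metrics tell different stories: the baseline scores 99% accuracy but has zero recall, which is exactly the failure that accuracy alone hides. When false negatives are the costlier error (a missed disease case), recall is the number to watch; when false positives are costlier (a legitimate transaction flagged as fraud), precision matters more.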