Machine Learning Academy · Lesson

Regression Metrics: MAE, MSE, RMSE, and R-Squared

Learners will calculate error metrics on a regression output, understand the scale-dependence of MAE and RMSE, and use R-squared as a standardised goodness-of-fit measure.

Why Regression Needs Different Metrics

Regression models predict continuous values such as house prices, temperatures, and sales volumes. In classification a prediction is either correct or wrong, but a regression prediction is almost never exactly right, so the useful question is how far off it is. Regression metrics quantify the difference between predicted and actual values. Different metrics weight errors differently: some penalise large errors more heavily than small ones, some are less sensitive to outliers, and some are normalised so they can be read as the fraction of variance explained. Which metric to choose depends on the problem's error cost structure.

import numpy as np

y_true = np.array([100, 150, 200, 250, 300], dtype=float)
y_pred = np.array([110, 140, 210, 240, 320], dtype=float)

errors = y_pred - y_true
print('Actual:   ', y_true)
print('Predicted:', y_pred)
print('Errors:   ', errors)  # Some positive, some negative
print('We need metrics that summarise these errors into a single number.')
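
One way to see why the signed errors cannot simply be averaged is that positive and negative errors cancel. The sketch below reuses the arrays from the example above; the variable names mean_signed and mean_abs are illustrative and not part of the lesson's own code.

```python
import numpy as np

y_true = np.array([100, 150, 200, 250, 300], dtype=float)
y_pred = np.array([110, 140, 210, 240, 320], dtype=float)
errors = y_pred - y_true  # [10, -10, 10, -10, 20]

# Averaging signed errors lets over- and under-predictions cancel out
mean_signed = np.mean(errors)
# Taking magnitudes first keeps every miss in the total
mean_abs = np.mean(np.abs(errors))

print('Mean signed error:  ', mean_signed)  # 4.0
print('Mean absolute error:', mean_abs)     # 12.0
```

The signed mean (4.0) understates how far off the predictions typically are, even though no single prediction is closer than 10 units to its target.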

MAE: Mean Absolute Error

MAE = mean(|y_true - y_pred|). Mean Absolute Error is the average of the absolute differences between predictions and actual values. It is in the same units as the target variable: a house price MAE of $15,000 means predictions are off by $15,000 on average. MAE treats all errors linearly: a prediction that is twice as wrong costs exactly twice as much. Because it does not square errors, MAE is less sensitive to outliers than MSE or RMSE. It is a reasonable default when the distribution of errors is roughly symmetric and outlier errors should not dominate the aggregate.

import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([100, 150, 200, 250, 300], dtype=float)
y_pred = np.array([110, 140, 210, 240, 320], dtype=float)

# Manual calculation
mae_manual = np.mean(np.abs(y_true - y_pred))
print('Manual MAE:', mae_manual)  # (10 + 10 + 10 + 10 + 20) / 5 = 12.0

# Sklearn
mae = mean_absolute_error(y_true, y_pred)
print('Sklearn MAE:', mae)
print(f'Interpretation: predictions are off by {mae:.1f} units on average')
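
To illustrate the linearity and outlier claims above, the following sketch compares absolute and squared penalties for each prediction, then replaces one prediction with a deliberately bad value. The outlier value (500) and the helper names mae and rmse are assumptions made for this illustration; RMSE is computed by hand with NumPy as the square root of the mean squared error.

```python
import numpy as np

y_true = np.array([100, 150, 200, 250, 300], dtype=float)
y_pred = np.array([110, 140, 210, 240, 320], dtype=float)

def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Per-prediction penalties: an error of 20 costs 2x an error of 10 under
# the absolute penalty, but 4x under the squared penalty
abs_cost = np.abs(y_true - y_pred)   # [10, 10, 10, 10, 20]
sq_cost = (y_true - y_pred) ** 2     # [100, 100, 100, 100, 400]
print('Absolute penalties:', abs_cost)
print('Squared penalties: ', sq_cost)

# Hypothetical outlier: the last prediction is now off by 200 instead of 20
y_pred_outlier = y_pred.copy()
y_pred_outlier[-1] = 500.0

print(f'MAE:  {mae(y_true, y_pred):.2f} -> {mae(y_true, y_pred_outlier):.2f}')
print(f'RMSE: {rmse(y_true, y_pred):.2f} -> {rmse(y_true, y_pred_outlier):.2f}')
```

With this single bad prediction, MAE grows fourfold (12 to 48) while RMSE grows roughly sevenfold (about 12.6 to about 89.9), which is the sense in which MAE lets outlier errors dominate less.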
