Machine Learning Academy · Lesson

Why You Cannot Evaluate on Training Data

Learners will demonstrate data leakage by evaluating a model that has memorised its training data, and see why held-out test data is essential for honest performance estimates.

The Evaluation Trap

After training a model, it is tempting to test it on the same data you used for training. The model will likely score very high, sometimes 100% accuracy, and this feels like success. It is not. Evaluating on training data is one of the most fundamental mistakes in machine learning: the score measures how well the model recalls examples it has already seen, so it says very little about real-world performance.

Understanding why this fails is not just a technicality; it changes how you think about the goal of machine learning. The goal is not to perform well on training data. The goal is to generalise to new, unseen data.
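The trap can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library, not a realistic training pipeline. The names `make_examples`, `LookupTableModel`, and `accuracy` are hypothetical helpers invented for this illustration. The labels are deliberately random coin flips, so there is nothing to learn and an honest score should sit near chance (about 0.5).

```python
import random


def make_examples(n, rng):
    """Return n (example_id, label) pairs with unique ids and coin-flip labels."""
    ids = rng.sample(range(10**6), n)  # sampling without replacement: ids never repeat
    return [(example_id, rng.randint(0, 1)) for example_id in ids]


class LookupTableModel:
    """A pure memoriser: stores every training answer and looks it up later."""

    def fit(self, data):
        self.table = dict(data)
        labels = [label for _, label in data]
        # For inputs it has never seen, fall back to the most common training label.
        self.default = max(set(labels), key=labels.count)
        return self

    def predict(self, example_id):
        return self.table.get(example_id, self.default)


def accuracy(model, data):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(model.predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
examples = make_examples(400, rng)
train, held_out = examples[:200], examples[200:]  # held_out is never shown to fit()

model = LookupTableModel().fit(train)
train_accuracy = accuracy(model, train)        # evaluated on data the model has seen
held_out_accuracy = accuracy(model, held_out)  # evaluated on data it has not seen

print(f"accuracy on training data: {train_accuracy:.2f}")
print(f"accuracy on held-out data: {held_out_accuracy:.2f}")
```

The training accuracy is perfect by construction, because every training example is in the lookup table. The held-out figure is the honest estimate: it should land close to chance, which is all a model can achieve on random labels.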

Memorisation vs Generalisation

Consider the difference between a student who memorises every answer in an exam prep book and one who actually understands the material. The first student scores perfectly on every practice problem but fails when the real exam has slightly different phrasing. The second student may not score perfectly on practice problems but handles new questions confidently.

An ML model that 'memorises' training examples (a heavily overfitted model) behaves like the first student. It can achieve perfect training accuracy yet perform far worse on new inputs. The memorisation itself is called overfitting. Scoring such a model on its own training data is a form of data leakage: you have 'leaked' the answers into the test, so the score is inflated.
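The two students can be sketched as two toy models in Python. This is an illustrative sketch under stated assumptions, not a benchmark: the data is synthetic (one feature, with the true rule "label is 1 when x > 0.5" and 20% of labels flipped as noise), and `make_noisy_data`, `fit_nearest_neighbour`, `fit_threshold`, and `accuracy` are hypothetical helpers written for this example. The nearest-neighbour model plays the memorising student; the single-threshold rule plays the student who learns the underlying idea.

```python
import random


def make_noisy_data(n, rng, flip_prob=0.2):
    """One feature x in [0, 1); label is 1 when x > 0.5, flipped with probability flip_prob."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < flip_prob:
            y = 1 - y  # label noise: memorising these flips cannot help on new data
        data.append((x, y))
    return data


def fit_nearest_neighbour(train):
    """Memoriser: predict the label of the single closest training point."""
    def predict(x):
        _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest_label
    return predict


def fit_threshold(train):
    """Generaliser: choose the one cut-off on a 0.00..1.00 grid that best fits the training data."""
    def train_accuracy(t):
        return sum((1 if x > t else 0) == y for x, y in train) / len(train)
    best_t = max((i / 100 for i in range(101)), key=train_accuracy)
    return lambda x: 1 if x > best_t else 0


def accuracy(predict, data):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in data) / len(data)


rng = random.Random(42)  # fixed seed so the run is repeatable
train = make_noisy_data(500, rng)
held_out = make_noisy_data(500, rng)  # fresh samples, never used for fitting

memoriser = fit_nearest_neighbour(train)
generaliser = fit_threshold(train)

memoriser_gap = accuracy(memoriser, train) - accuracy(memoriser, held_out)
generaliser_gap = accuracy(generaliser, train) - accuracy(generaliser, held_out)

print(f"memoriser   train/held-out gap: {memoriser_gap:.2f}")
print(f"generaliser train/held-out gap: {generaliser_gap:.2f}")
```

The memoriser scores perfectly on its own training set, since each point is its own nearest neighbour, but that includes the noisy labels it has copied, so its gap to held-out accuracy should be large. The threshold rule cannot fit the noise, so its training score is lower and should sit much closer to its held-out score.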
