Cross-Validation Strategies
k-fold, stratified k-fold, leave-one-out, time-series split — when to use each.
Why Cross-Validation
A single train/test split can be lucky or unlucky. Cross-validation repeats the evaluation on different splits and averages the scores, giving a more reliable estimate of performance.
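As a rough illustration of that variability, the sketch below scores one model on several single splits and then with 5-fold cross-validation. The synthetic dataset from make_classification and the LogisticRegression model are assumptions chosen only for the demo; they are not part of the lesson.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data, assumed purely for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Score the same model on five different single train/test splits.
single_split_scores = []
for seed in range(5):
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    single_split_scores.append(model.score(Xte, yte))

# Score the same model with 5-fold cross-validation.
# For a classifier, an integer cv uses stratified folds by default.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

print("single splits:", [round(s, 3) for s in single_split_scores])
print("cross-validated mean:", round(cv_scores.mean(), 3))
```

The single-split accuracies typically differ from seed to seed, while the cross-validated mean summarizes five evaluations in one number.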
K-Fold Cross-Validation
KFold splits the data into K folds of nearly equal size. Each fold serves once as the test set while the remaining K - 1 folds train the model, so you get K scores to average.
from sklearn.model_selection import KFold

# X and y are assumed to be NumPy arrays with one row per sample
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    Xtr, Xte = X[train_idx], X[test_idx]
    ytr, yte = y[train_idx], y[test_idx]
    # train and evaluate here
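One way to fill in the "train and evaluate" step is sketched below. The synthetic data and the LogisticRegression model are illustrative assumptions, and the test_counts array is a hypothetical helper added only to confirm that each sample lands in a test fold exactly once.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Synthetic data, assumed purely for illustration.
X, y = make_classification(n_samples=100, n_features=8, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
test_counts = np.zeros(len(X), dtype=int)  # times each sample was tested

for train_idx, test_idx in kf.split(X):
    # Fit a fresh model on the training folds only.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    # Accuracy on the held-out fold.
    scores.append(model.score(X[test_idx], y[test_idx]))
    test_counts[test_idx] += 1

mean_score = float(np.mean(scores))
print(f"fold accuracies: {np.round(scores, 3)}")
print(f"mean accuracy: {mean_score:.3f}")
```

Creating a new model inside the loop keeps each fold's evaluation independent of the others.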