Grid Search vs Random Search
Learners will configure GridSearchCV and RandomizedSearchCV on the same hyperparameter space, compare their coverage and computational cost, and choose the more efficient method for large search spaces.
The Hyperparameter Search Problem
Most machine learning models have multiple hyperparameters that cannot be learned from data and must be set by the practitioner. Finding the best combination by hand is impractical because the number of combinations grows exponentially with the number of hyperparameters. Two systematic approaches dominate: Grid Search, which evaluates every combination on a predefined grid, and Random Search, which samples a fixed number of combinations at random from specified distributions. Knowing when to use each is a crucial practical skill.
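To make the contrast concrete, here is a minimal sketch that uses only the Python standard library. The value lists, the log-uniform ranges, and the budget of 6 samples are illustrative assumptions, not recommendations.

```python
import itertools
import random

# Illustrative grid (assumed values): two hyperparameters, three values each.
grid_values = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# Grid search: enumerate every combination of the listed values.
names = list(grid_values)
grid_candidates = [
    dict(zip(names, combo))
    for combo in itertools.product(*(grid_values[n] for n in names))
]

# Random search: draw a fixed budget of combinations from distributions.
# Each value is sampled log-uniformly over an assumed range.
rng = random.Random(0)  # fixed seed so the draws are reproducible
n_samples = 6           # assumed search budget
random_candidates = [
    {"C": 10 ** rng.uniform(-1, 1), "gamma": 10 ** rng.uniform(-2, 0)}
    for _ in range(n_samples)
]

print(len(grid_candidates))    # 9 combinations, fixed by the grid
print(len(random_candidates))  # 6 combinations, fixed by the budget
```

The size of the grid is dictated by the value lists, whereas the size of the random sample is whatever budget you choose.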
Grid Search: Exhaustive Evaluation
GridSearchCV evaluates every combination of the hyperparameter values you specify. For a grid with 4 values of C, 5 values of gamma, and 5 CV folds, it trains 4 × 5 × 5 = 100 models during cross-validation. This guarantees you find the best-scoring combination within your grid, although values between or outside the grid points are never tried. The cost grows multiplicatively: adding a third hyperparameter with 4 values raises the search to 100 × 4 = 400 models. Grid search works well when you have one or two hyperparameters and a manageable grid size.
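The arithmetic above can be checked with a short sketch. The helper name count_fits is hypothetical and not part of scikit-learn. It counts only cross-validation fits and ignores the final refit of the best model.

```python
import math

def count_fits(grid_sizes, n_folds):
    """Cross-validation fits: product of the per-parameter grid sizes times the folds."""
    return math.prod(grid_sizes) * n_folds

two_params = count_fits([4, 5], n_folds=5)       # 4 values of C, 5 of gamma
three_params = count_fits([4, 5, 4], n_folds=5)  # add a third parameter with 4 values

print(two_params)    # 100
print(three_params)  # 400
```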
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_breast_cancer
import numpy as np
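# Load the breast cancer dataset as a feature matrix X and a label vector y
X, y = load_breast_cancer(return_X_y=True)
# Scale inside the pipeline so each CV fold is standardized using only its own training split
pipe = Pipeline([('sc', StandardScaler()), ('svc', SVC())])
# Keys follow the '<step name>__<parameter>' convention to reach into the pipeline
param_grid = {'svc__C': [0.1, 1, 10], 'svc__gamma': [0.01, 0.1, 1]}
# 3 * 3 combinations * 5 folds = 45 cross-validation fits, plus one final refit on all data
grid = GridSearchCV(pipe, param_grid, cv=5, n_jobs=-1)
grid.fit(X, y)
print('Best params:', grid.best_params_)
print('Best CV score:', round(grid.best_score_, 4))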
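Beyond the single best result, the fitted search object records every candidate it evaluated. The sketch below repeats the same setup so that it runs on its own, then lists each combination by rank using the cv_results_ attribute. The variable names results, n_candidates, and order are illustrative choices. Scores and timings depend on your machine and library version, so no output is shown.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data, pipeline, and grid as the example above.
X, y = load_breast_cancer(return_X_y=True)
pipe = Pipeline([("sc", StandardScaler()), ("svc", SVC())])
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)

results = grid.cv_results_
n_candidates = len(results["params"])  # one entry per grid combination

# Order candidate indices by rank (1 = best mean cross-validation score).
order = sorted(range(n_candidates), key=lambda i: results["rank_test_score"][i])

for i in order:
    print(
        results["rank_test_score"][i],
        results["params"][i],
        round(results["mean_test_score"][i], 4),  # mean accuracy over the 5 folds
        round(results["mean_fit_time"][i], 4),    # mean seconds per fit
    )
```

Looking at the full table shows how sharply the score changes across the grid and how fit time varies between candidates, which is useful when comparing search strategies on cost.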