0Pricing
Learn AI with Python · Lesson

Random Forests and Bagging

Ensemble concept, RandomForestClassifier, n_estimators, feature importance, OOB score.

The Problem with Single Trees

A single decision tree has high variance: retraining on slightly different data can produce a very different tree. Ensembles fix this by combining many trees.

Bagging (Bootstrap Aggregating)

Bagging trains many models on different bootstrap samples (random sampling with replacement) of the data, then averages their predictions.

Averaging many high-variance models reduces overall variance without increasing bias much.

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=100,
    random_state=0,
)
bag.fit(Xtr, ytr)

All lessons in this course

  1. Decision Trees: Theory and Implementation
  2. Random Forests and Bagging
  3. Gradient Boosting: GBM and XGBoost
  4. LightGBM and CatBoost
← Back to Learn AI with Python