LightGBM: Leaf-Wise Growth and Speed Advantages
Learners will benchmark LightGBM against XGBoost on a large dataset, understand leaf-wise vs level-wise tree growth, and use categorical feature support.
What Is LightGBM?
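LightGBM (Light Gradient Boosting Machine) is a gradient boosting library developed by Microsoft, open-sourced in 2016 and described in a 2017 paper. It was designed to address the training-time and memory costs of earlier gradient boosting implementations, XGBoost included, on large datasets. LightGBM introduced two key algorithmic innovations. Gradient-based One-Side Sampling (GOSS) reduces the number of data instances used in each boosting round: it keeps the instances with the largest gradients (the ones the model currently fits worst) and randomly samples from the rest. Exclusive Feature Bundling (EFB) reduces the effective number of features by bundling sparse features that are mutually exclusive, meaning they rarely take non-zero values on the same row. Together these typically make LightGBM faster to train than XGBoost on large tabular datasets, although the size of the gap depends on the data and on the XGBoost settings used.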
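To make the GOSS idea concrete, here is a minimal pure-Python sketch of the sampling step. It is an illustration, not LightGBM's implementation: the function name goss_sample, the parameters top_rate and other_rate, and the synthetic gradients are all assumptions chosen for this example. It keeps the largest-gradient instances, randomly samples from the remainder, and up-weights the sampled ones by (1 - top_rate) / other_rate so that their contribution to the gradient sums stays roughly representative.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Illustrative GOSS-style sampler (hypothetical helper, not LightGBM's API).

    Returns (indices, weights) for the instances kept in this round.
    """
    n = len(gradients)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top, rest = order[:n_top], order[n_top:]

    # Randomly sample a subset of the small-gradient remainder.
    rng = random.Random(seed)
    sampled = rng.sample(rest, n_other)

    # Up-weight the sampled small-gradient instances to compensate
    # for the ones that were dropped.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * n_top + [amplify] * n_other
    return indices, weights

# Synthetic gradients, made up purely for the demonstration.
grads = [((i * 37) % 101 - 50) / 50.0 for i in range(1000)]
idx, w = goss_sample(grads)
print(len(idx), "of", len(grads), "instances kept")
```

With these assumed rates only 30% of the instances are used to evaluate splits in a round, which is where the speed-up comes from.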
Level-Wise vs Leaf-Wise Tree Growth
Most gradient boosting implementations (including XGBoost by default) grow trees level-wise: every node at one depth is split before any node at the next depth. This keeps trees balanced but spends computation on splits that reduce the loss very little. LightGBM grows trees leaf-wise: at each step it finds the single leaf, anywhere in the tree, whose split would reduce the loss the most, and splits only that leaf, regardless of its depth. The resulting trees are unbalanced, but for the same number of leaves they usually reach a lower training loss than level-wise trees. The trade-off is that leaf-wise trees can grow very deep and overfit on small datasets, which is why LightGBM limits tree complexity with the num_leaves and max_depth parameters.
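The difference between the two growth strategies can be sketched with a toy simulation. Nothing here comes from LightGBM or XGBoost: the gain model in child_gains (one child inherits 70% of its parent's gain, the other 10%) is a made-up assumption, and the function names are hypothetical. Both strategies get the same budget of splits; the sketch compares the total loss reduction and the depth each one reaches.

```python
import heapq

def child_gains(gain):
    """Hypothetical gain model: one child remains promising, the other does not."""
    return 0.7 * gain, 0.1 * gain

def grow_leaf_wise(root_gain, max_splits):
    """Always split the leaf with the largest gain, wherever it is in the tree."""
    heap = [(-root_gain, 0)]  # max-heap via negated gains: (-gain, depth)
    total, max_depth = 0.0, 0
    for _ in range(max_splits):
        neg_gain, depth = heapq.heappop(heap)
        total += -neg_gain
        max_depth = max(max_depth, depth + 1)
        for g in child_gains(-neg_gain):
            heapq.heappush(heap, (-g, depth + 1))
    return total, max_depth

def grow_level_wise(root_gain, max_splits):
    """Split every leaf at the current depth before moving one level down."""
    level = [root_gain]
    total, depth, splits = 0.0, 0, 0
    while splits < max_splits:
        next_level = []
        for g in level:
            if splits == max_splits:
                break
            total += g
            splits += 1
            next_level.extend(child_gains(g))
        depth += 1
        level = next_level
    return total, depth

leaf_total, leaf_depth = grow_leaf_wise(1.0, max_splits=7)
level_total, level_depth = grow_level_wise(1.0, max_splits=7)
print(f"leaf-wise:  total gain {leaf_total:.3f}, depth {leaf_depth}")
print(f"level-wise: total gain {level_total:.3f}, depth {level_depth}")
```

Under this assumed gain model the leaf-wise tree achieves a larger total loss reduction with the same seven splits, but it is also much deeper, which illustrates why a cap on leaves or depth matters in practice.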