Target Encoding and Advanced Categorical Handling
Target encoding, frequency encoding, binary encoding, and embeddings for high-cardinality columns.
The Categorical Encoding Problem
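Most models require numeric inputs, but categorical features are usually stored as text labels. One-hot encoding works well when a feature has only a few categories, but a high-cardinality feature (thousands of cities or products) produces one new column per category, which quickly becomes too many columns to handle.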
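To make the column growth concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper `one_hot` and the sample city names are hypothetical and only illustrate the idea; a real project would more likely rely on a library encoder.

```python
def one_hot(values):
    """Return (categories, rows): one 0/1 column per distinct category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is "hot" per row
        rows.append(row)
    return categories, rows


# Hypothetical low-cardinality column: 3 distinct cities -> 3 columns.
cities = ["Paris", "Lima", "Paris", "Oslo"]
categories, rows = one_hot(cities)
print(categories)  # ['Lima', 'Oslo', 'Paris']
print(rows[0])     # [0, 0, 1]

# Synthetic high-cardinality column: the number of one-hot columns
# equals the number of distinct values, here 5000.
many_cities = [f"city_{i}" for i in range(5000)]
n_columns = len(set(many_cities))
print(n_columns)   # 5000
```

With three cities the encoding is compact, but the column count grows one-for-one with the number of distinct categories, which is the problem described above.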
Limits of One-Hot and Label Encoding
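One-hot encoding explodes dimensionality: every distinct category becomes its own column. Plain label encoding avoids that by assigning each category an arbitrary integer, but it invents a false ordering (city 5 is not "greater than" city 2, yet a model may treat it that way). High-cardinality data therefore calls for smarter encodings.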
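The contrast can be sketched in a few lines of plain Python. The functions `label_encode`, `frequency_encode`, and `target_encode` below are illustrative helpers, not a library API, and the additive smoothing used in `target_encode` (with its `smoothing` parameter) is one common choice assumed here, not the only way to define the encoding. The cities and targets are made-up sample data.

```python
from collections import Counter, defaultdict


def label_encode(values):
    """Map each category to an arbitrary integer (alphabetical order here)."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def frequency_encode(values):
    """Replace each category with its share of the rows."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]


def target_encode(values, targets, smoothing=1.0):
    """Replace each category with its mean target, pulled toward the
    global mean so that rare categories are not over-trusted."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return [encoding[v] for v in values]


# Hypothetical sample data.
cities = ["Paris", "Lima", "Paris", "Oslo"]
targets = [1, 0, 1, 0]

labels = label_encode(cities)
freqs = frequency_encode(cities)
encoded = target_encode(cities, targets)

print(labels)                          # [2, 0, 2, 1]
print(freqs)                           # [0.5, 0.25, 0.5, 0.25]
print([round(x, 3) for x in encoded])  # [0.833, 0.25, 0.833, 0.25]
```

The label codes imply Paris (2) > Oslo (1) > Lima (0), an order that means nothing. Frequency and target encoding each produce a single numeric column regardless of how many categories exist. Note that this sketch computes the target encoding on the same rows it encodes, which leaks the target into the feature; in practice the mapping should be fitted on training data only (for example, out-of-fold).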