The Kernel Trick: RBF, Polynomial, and Sigmoid Kernels
Learners will apply RBF and polynomial kernels to a non-linearly separable dataset, and understand that kernels implicitly project data to higher dimensions.
The Problem: Non-Linear Data
Many real-world classification problems are not linearly separable: no straight line (or, in higher dimensions, hyperplane) can separate the classes without error. For example, two classes arranged in concentric rings cannot be separated by any linear boundary. One approach is to manually create new features (e.g., x², x·y) that make the classes linearly separable in the augmented space. The kernel trick achieves the same effect implicitly: an SVM only needs inner products between pairs of points in the high-dimensional space, and a kernel function computes those inner products directly from the original inputs, without ever computing the high-dimensional coordinates.
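The manual-feature approach can be illustrated with a small, standard-library-only sketch. It assumes a synthetic two-ring dataset (the `make_rings` helper, its radii, and its noise level are hypothetical choices made for this illustration) and uses the hand-crafted feature x² + y², which is a linear function of the augmented features x² and y². A single threshold on that feature then separates the rings, something no line in the original (x, y) plane can do.

```python
import math
import random


def make_rings(n_per_ring=100, inner_radius=1.0, outer_radius=3.0,
               noise=0.1, seed=0):
    """Return (points, labels): label 0 on the inner ring, 1 on the outer ring.

    Hypothetical helper for this illustration only.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for label, radius in ((0, inner_radius), (1, outer_radius)):
        for _ in range(n_per_ring):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.uniform(-noise, noise)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels


def squared_radius(point):
    """Hand-crafted feature x^2 + y^2 (linear in the new features x^2, y^2)."""
    x, y = point
    return x * x + y * y


points, labels = make_rings()
features = [squared_radius(p) for p in points]

# Place a threshold halfway between the two classes in the new feature.
inner_max = max(f for f, lab in zip(features, labels) if lab == 0)
outer_min = min(f for f, lab in zip(features, labels) if lab == 1)
threshold = (inner_max + outer_min) / 2.0

# A linear rule in the augmented space: predict 1 if x^2 + y^2 > threshold.
predictions = [1 if f > threshold else 0 for f in features]
accuracy = sum(p == lab for p, lab in zip(predictions, labels)) / len(labels)

print(f"inner max r^2 = {inner_max:.3f}, outer min r^2 = {outer_min:.3f}")
print(f"accuracy with one threshold on x^2 + y^2: {accuracy:.2f}")
```

The drawback of this approach is that the useful feature had to be guessed by hand. A kernel removes that step by working with inner products in the augmented space directly.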
Feature Maps: Lifting Data to Higher Dimensions
A feature map φ(x) transforms an input vector into a higher-dimensional representation. For example, φ([x₁, x₂]) = [x₁², √2·x₁x₂, x₂²] maps 2D data to 3D. After this mapping, classes that no straight line could separate in 2D may become linearly separable in 3D. The SVM then finds a maximum-margin hyperplane in the transformed space. Mapped back to the original 2D space, that hyperplane corresponds to a curved decision boundary, which is what gives the SVM its non-linear classification ability.
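The example map above has a useful property that can be checked numerically: the inner product φ(x)·φ(z) in 3D equals (x·z)², the homogeneous degree-2 polynomial kernel evaluated in 2D. The sketch below is a minimal check of that identity in plain Python; the function names (`phi`, `dot`, `poly2_kernel`) and the two sample points are assumptions made for illustration.

```python
import math


def phi(x):
    """Explicit feature map from the text: [x1^2, sqrt(2)*x1*x2, x2^2]."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z)^2, computed in 2D."""
    return dot(x, z) ** 2


# Two arbitrary sample points (illustrative values only).
x = (1.0, 2.0)
z = (3.0, -1.0)

explicit = dot(phi(x), phi(z))   # map to 3D first, then take the inner product
implicit = poly2_kernel(x, z)    # never leaves 2D

print("explicit:", explicit)
print("implicit:", implicit)
print("match:", math.isclose(explicit, implicit))
```

The two values agree up to floating-point rounding, which is the kernel trick in miniature: the kernel returns the inner product the SVM needs without ever constructing φ(x). For this small map the saving is negligible, but the same idea applies when the feature space is very large or, as with the RBF kernel, infinite-dimensional.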