DBSCAN: Core Points, Border Points, and Noise
Learners will configure eps and min_samples, identify core, border, and noise points on a crescent-shaped dataset, and see DBSCAN discover non-convex clusters that K-Means misses.
Why K-Means Fails on Arbitrary Shapes
K-Means assigns every point to its nearest centroid, which implicitly assumes that clusters are convex and roughly spherical. On crescent, ring, or elongated shapes, the nearest centroid often belongs to a different group, so K-Means cuts through the true clusters. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) overcomes this by defining clusters as dense regions separated by low-density areas, which lets it discover clusters of any shape and set aside isolated points as noise.
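The contrast can be checked on a small synthetic example. The sketch below assumes scikit-learn is installed and uses its make_moons generator as the crescent-shaped dataset; the sample size, noise level, eps=0.2, and min_samples=5 are illustrative choices, not values prescribed by this lesson. Agreement with the true crescents is measured with the adjusted Rand index, where 1.0 means a perfect match.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaving crescents (assumed toy data, not a course dataset).
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

# K-Means with K=2 splits the plane by distance to two centroids.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN grows clusters through dense neighbourhoods instead.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: agreement between predicted and true grouping.
kmeans_ari = adjusted_rand_score(y_true, kmeans_labels)
dbscan_ari = adjusted_rand_score(y_true, dbscan_labels)

print(f"K-Means ARI: {kmeans_ari:.2f}")
print(f"DBSCAN ARI:  {dbscan_ari:.2f}")
```

On data like this, DBSCAN should score noticeably higher than K-Means, because the straight boundary between two centroids cannot follow the curve of either crescent.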
Two Key Hyperparameters: eps and min_samples
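DBSCAN is controlled by two parameters: eps (epsilon) defines the radius of the neighbourhood around a point, and min_samples sets the minimum number of points (including the point itself) that must lie within that radius for the point to count as part of a dense region. Together they sort every point into one of three roles. A core point has at least min_samples points within eps. A border point has fewer than min_samples neighbours but lies within eps of a core point, so it joins that core point's cluster. A noise point is neither: it is not dense itself and is not within reach of any core point.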
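As a minimal sketch of how these three roles can be read off a fitted model, the example below assumes scikit-learn's DBSCAN, which exposes the indices of core points in core_sample_indices_ and marks noise with the label -1 in labels_. The dataset and the values eps=0.15 and min_samples=5 are illustrative assumptions; border points are derived here as everything that is clustered but not core.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Crescent-shaped toy data with a little more noise (assumed values).
X, _ = make_moons(n_samples=300, noise=0.08, random_state=42)

db = DBSCAN(eps=0.15, min_samples=5).fit(X)

# Core points: at least min_samples points within eps (self included).
core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True

# Noise points: scikit-learn labels them -1.
noise_mask = db.labels_ == -1

# Border points: assigned to a cluster but not dense enough to be core.
border_mask = ~core_mask & ~noise_mask

n_clusters = len(set(db.labels_) - {-1})
print(f"clusters: {n_clusters}")
print(f"core: {core_mask.sum()}, border: {border_mask.sum()}, noise: {noise_mask.sum()}")
```

Rerunning this with a smaller eps or a larger min_samples typically moves points from core to border or noise, which is a quick way to get a feel for how the two parameters interact.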