Machine Learning Academy · Lesson

Choosing K: Elbow Method and Silhouette Score

Learners will plot inertia vs k (elbow) and compute silhouette coefficients to pick the number of clusters that gives well-separated, compact groups.

Why Choosing k Matters

K-Means requires you to specify k — the number of clusters — before training. Too few clusters and you lump distinct groups together; too many and you split natural groups artificially. There is no universally correct k, but two diagnostic tools — the elbow method and the silhouette score — give principled guidance.

Inertia Decreases as k Grows

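Inertia is the sum of squared distances from each point to its assigned centroid. As you increase k, the best achievable inertia never increases, because an extra centroid can only leave each point as close to its nearest centroid as before, or closer. At k=n (one cluster per point), inertia is zero. This means you cannot simply minimise inertia: you need to find where additional clusters stop providing meaningful reductions. That point of diminishing returns is the elbow.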
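To make the two extremes concrete, here is a minimal NumPy sketch that computes inertia by hand for k=1 and k=n. The helper name `inertia` and the tiny random dataset are assumptions made for illustration; they are not part of the lesson's own code.

```python
import numpy as np


def inertia(X, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    diffs = X - centroids[labels]
    return float(np.sum(diffs ** 2))


# Hypothetical toy data: 6 points in 2-D, seeded for reproducibility
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))

# k = 1: a single centroid at the overall mean, every point assigned to it
one_centroid = X.mean(axis=0, keepdims=True)
inertia_k1 = inertia(X, one_centroid, np.zeros(len(X), dtype=int))

# k = n: every point is its own centroid, so all distances vanish
inertia_kn = inertia(X, X.copy(), np.arange(len(X)))

print(f'k=1: {inertia_k1:.3f}')
print(f'k=n: {inertia_kn:.3f}')
```

The k=n case shows why the smallest inertia is not a useful target on its own: it is reached by a clustering that groups nothing.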

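from sklearn.cluster import KMeans
import numpy as np

# Synthetic 2-D data, seeded so the inertia values are reproducible.
# Note: a single Gaussian blob has no true cluster structure, so expect
# a smooth curve rather than a sharp elbow.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 2))
inertias = []

# Fit K-Means for k = 1..10 and record the inertia of each fit
for k in range(1, 11):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(X)
    inertias.append(km.inertia_)

print('Inertia per k:')
for k, inr in enumerate(inertias, start=1):
    print(f'  k={k}: {inr:.1f}')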
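The silhouette score named in the introduction can be computed with a similar loop. The sketch below is one possible way to do it, not the lesson's own code: it assumes synthetic data from `make_blobs` with four hand-picked, well-separated centres, and it starts at k=2 because the silhouette coefficient is undefined for a single cluster.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumption: four well-separated blobs, chosen only for illustration
centers = [(-6, -6), (-6, 6), (6, -6), (6, 6)]
X, _ = make_blobs(n_samples=300, centers=centers,
                  cluster_std=0.6, random_state=42)

scores = {}
for k in range(2, 11):  # silhouette needs at least 2 clusters
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    labels = km.fit_predict(X)
    # Mean silhouette coefficient over all points, in the range [-1, 1]
    scores[k] = silhouette_score(X, labels)

print('Silhouette per k:')
for k, s in scores.items():
    print(f'  k={k}: {s:.3f}')

# Pick the k with the highest mean silhouette
best_k = max(scores, key=scores.get)
print(f'Best k by silhouette: {best_k}')
```

Unlike inertia, the silhouette score does not improve automatically as k grows, so its maximum can be read off directly. On clearly separated data like this, the peak would be expected at the true number of blobs; on real data the peak is often flatter and is best read alongside the elbow plot.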
