Rank and Percentile within Groups
Compute within-group ranks using groupby().rank() and create percentile buckets with pd.qcut for relative comparisons.
Why Rank Within Groups?
Raw values are often less meaningful than relative rankings. A sales rep with $50,000 in monthly revenue is a top performer if the average is $30,000, but a poor performer if the average is $80,000. By computing ranks within groups (e.g. rank within each region or rank within each quarter), you get a normalised comparison that accounts for different baselines across groups. Pandas makes within-group ranking easy with groupby().rank().
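To make the $50,000 example concrete, here is a minimal sketch on a small hand-written table. The rep names and sales figures are hypothetical and chosen only for illustration. The same $50,000 comes out as rank 1 in one region and rank 3 in the other.

```python
import pandas as pd

# Hypothetical figures, chosen only to illustrate the idea
toy = pd.DataFrame({
    'rep': ['Ann', 'Ben', 'Cat', 'Dan', 'Eli', 'Fay'],
    'region': ['East', 'East', 'East', 'West', 'West', 'West'],
    'sales': [50000, 30000, 40000, 50000, 80000, 90000],
})

# Rank within each region; ascending=False makes rank 1 the top seller
toy['rank_in_region'] = toy.groupby('region')['sales'].rank(ascending=False)

# pct=True expresses the (ascending) rank as a fraction of the group size
toy['pct_in_region'] = toy.groupby('region')['sales'].rank(pct=True)

print(toy)
```

Ann and Dan both sold $50,000, yet Ann tops East while Dan is last in West. That is the baseline effect described above.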
import pandas as pd
import numpy as np
np.random.seed(42)
df = pd.DataFrame({
'rep': ['Alice', 'Bob', 'Carol', 'Dave', 'Eve',
'Frank', 'Grace', 'Hank', 'Iris', 'Jake'],
'region': ['East']*5 + ['West']*5,
'sales': np.random.randint(30000, 100000, 10)
})
print(df.sort_values('region'))
Series.rank(): Basic Ranking
Series.rank() assigns a rank to each value: by default, rank 1 goes to the smallest value and rank n to the largest. Passing ascending=False reverses the direction so that rank 1 goes to the largest value. The result is a float Series rather than an integer one because, by default, tied values each receive the average of the ranks they span. For a column of 5 distinct values, ranks run from 1.0 to 5.0; with ties, some values share a fractional rank such as 2.5.
import pandas as pd
s = pd.Series([80, 45, 90, 45, 70])
print('Values:', s.values)
print('Rank (ascending=True, default):', s.rank().values)
print('Rank (ascending=False — best=1):', s.rank(ascending=False).values)
# Note: two 45s share ranks 1 and 2 → both get 1.5
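Averaging is only the default way of resolving ties. As a hedged sketch, the loop below reuses the same five values and prints the result of each option accepted by the method parameter of Series.rank(). The choice of which methods to show is illustrative and not a recommendation.

```python
import pandas as pd

s = pd.Series([80, 45, 90, 45, 70])

# 'average': tied values get the mean of the ranks they span (default)
# 'min' / 'max': tied values all get the lowest / highest of those ranks
# 'first': ties are broken by order of appearance in the Series
# 'dense': like 'min', but the next distinct value gets the next integer
for method in ['average', 'min', 'max', 'first', 'dense']:
    print(f'{method:>8}:', s.rank(method=method).values)
```

Every method except 'average' produces whole-number ranks, although they are still returned as floats. Appending .astype(int) converts them when integer ranks are wanted.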