Pandas & NumPy Academy · Lesson

apply() with GroupBy

Pass a function that operates on an entire group to groupby().apply() to compute group-level summaries that a per-column agg() call cannot express.

Why GroupBy Needs apply()

The built-in agg() method handles simple aggregations such as sum, mean, and count, reducing each column independently to one scalar per group. Some group-level computations, however, need the group's entire sub-DataFrame rather than a single column. groupby().apply(func) calls your function once per group, passing it that group's rows as a DataFrame, and then combines the return values into a single result. This makes it possible to write summaries that span several columns, which a per-column agg() call cannot express.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'North'],
    'product': ['A', 'B', 'A', 'A', 'C'],
    'revenue': [100, 250, 180, 90, 300]
})
print(df)
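The split, apply, and combine steps described above can be sketched by hand and compared with what groupby().apply() returns. This is only an illustration: the helper name revenue_range is hypothetical, and the columns are selected explicitly so the function receives just the non-grouping columns.

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'North'],
    'product': ['A', 'B', 'A', 'A', 'C'],
    'revenue': [100, 250, 180, 90, 300]
})

def revenue_range(group):
    # 'group' is a DataFrame holding the rows of a single region
    return group['revenue'].max() - group['revenue'].min()

# Manual version: iterate over the groups, call the function, collect the results
manual = pd.Series({key: revenue_range(group) for key, group in df.groupby('region')})

# apply() performs the same loop and assembles the result itself
applied = df.groupby('region')[['product', 'revenue']].apply(revenue_range)

print(manual)
print(applied)
```

Both Series carry one value per region; apply() additionally labels the index with the grouping column's name.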

Returning a Scalar per Group

When the function passed to groupby().apply() returns a scalar, the result is a Series indexed by the group keys, the same shape that agg() produces for a single column. This form is useful when the scalar depends on more than one column or combines several reductions, such as the ratio of top-product revenue to total group revenue, which a single built-in aggregation in an agg() column spec cannot express.

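def top_product_share(group):
    # Sum revenue per product first, so a product that appears on
    # several rows (South has two 'A' rows) is counted as one product
    per_product = group.groupby('product')['revenue'].sum()
    total = group['revenue'].sum()
    return per_product.max() / total

# Select the needed columns explicitly so the function never receives
# the grouping column itself
share = df.groupby('region')[['product', 'revenue']].apply(top_product_share)
print(share)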
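As a further sketch of a scalar that needs two columns at once, the function below looks up the product label on the highest-revenue row of each region. The helper name top_row_product is hypothetical, and the agg() call is included only to show that both results are indexed by the same group keys.

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'North'],
    'product': ['A', 'B', 'A', 'A', 'C'],
    'revenue': [100, 250, 180, 90, 300]
})

def top_row_product(group):
    # Find the row with the highest revenue, then read its product label
    return group.loc[group['revenue'].idxmax(), 'product']

# apply(): one scalar (a product label) per region
best = df.groupby('region')[['product', 'revenue']].apply(top_row_product)

# agg(): one scalar (the maximum revenue) per region, computed from a single column
peak = df.groupby('region')['revenue'].agg('max')

print(best)
print(peak)
print(best.index.equals(peak.index))
```

agg() can report the peak revenue, but it cannot say which product produced it, because each column is reduced without access to the others.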
