apply() with GroupBy
Pass a function that operates on a whole group to groupby().apply() to compute group-level summaries that the built-in agg() reducers cannot express.
Why GroupBy Needs apply()
The built-in agg() method handles simple aggregations such as sum, mean, and count, each of which reduces one column to one scalar per group. Some group-level computations, however, need the entire sub-DataFrame for the group rather than a single column. groupby().apply(func) passes each group's DataFrame to your function and combines the results, enabling summaries that a per-column agg() spec cannot express.
import pandas as pd
import numpy as np
df = pd.DataFrame({
'region': ['North', 'North', 'South', 'South', 'North'],
'product': ['A', 'B', 'A', 'A', 'C'],
'revenue': [100, 250, 180, 90, 300]
})
print(df)
Returning a Scalar per Group
When the function passed to groupby().apply() returns a scalar, the result is a Series indexed by the group keys, the same shape that agg() produces for a single column. This form is useful when the scalar combines several reductions or several columns, such as the ratio of top-product revenue to total group revenue, which no single built-in reducer such as 'max' or 'sum' yields on its own.
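def top_product_share(group):
    # group is the sub-DataFrame holding all rows for one region
    top = group['revenue'].max()
    total = group['revenue'].sum()
    # Returning a scalar yields one value per group
    return top / total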
share = df.groupby('region').apply(top_product_share)
print(share)
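Note that top_product_share takes the largest single revenue row in each group. In the sample data, product A appears twice in South, so a share based on per-product totals needs both the product and revenue columns. The following is a minimal sketch of that multi-column variant, not a definitive implementation. The helper name top_product_share_by_product is hypothetical, and selecting the columns explicitly before apply() is an assumption made here so the call behaves the same across pandas versions that differ in how they treat the grouping column.

```python
import pandas as pd

# Same sample data as above
df = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South', 'North'],
    'product': ['A', 'B', 'A', 'A', 'C'],
    'revenue': [100, 250, 180, 90, 300],
})

def top_product_share_by_product(group):
    # Hypothetical helper: sum revenue per product within this region first,
    # so repeated rows for the same product are combined
    per_product = group.groupby('product')['revenue'].sum()
    # Share of the best-selling product in the region's total revenue
    return per_product.max() / group['revenue'].sum()

# Select only the columns the function needs; the grouping column
# stays out of the frame passed to the function
share_by_product = (
    df.groupby('region')[['product', 'revenue']]
      .apply(top_product_share_by_product)
)
print(share_by_product)
```

With this data the two definitions agree for North, where every product appears once, and differ for South, where both rows belong to product A.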