Pandas & NumPy Academy · Lesson

pivot_table: Cross-Tabulation

Create Excel-style pivot tables with pd.pivot_table, setting index, columns, values, and aggfunc.

What Is a Pivot Table?

A pivot table is a cross-tabulation that summarises data along two categorical dimensions: one forms the rows, the other forms the columns, and each cell holds an aggregated value for that row-column pair. If you have ever built a pivot table in Excel, you will recognise the concept immediately. Pandas provides the top-level function pd.pivot_table() and the equivalent DataFrame method, DataFrame.pivot_table(), to create these summaries entirely in Python.
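
As a minimal sketch of the two call styles, the example below builds a small table both ways and checks that the results match. The sales DataFrame and its figures are invented for illustration, and the parameters used here are explained in the next section.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
sales = pd.DataFrame({
    'region':  ['East', 'West', 'East', 'West'],
    'product': ['A',    'A',    'B',    'B'],
    'revenue': [100,    200,    300,    400]
})

# Top-level function: the DataFrame is passed as the first argument
via_function = pd.pivot_table(sales,
                              values='revenue',
                              index='region',
                              columns='product',
                              aggfunc='sum')

# DataFrame method: same parameters, called on the DataFrame itself
via_method = sales.pivot_table(values='revenue',
                               index='region',
                               columns='product',
                               aggfunc='sum')

# Both calls should produce identical tables
print(via_function.equals(via_method))
```

Which form to use is largely a matter of style; the method form tends to read more naturally inside a chain of DataFrame operations.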

Basic pivot_table Call

Besides the DataFrame itself, passed as the first argument, pd.pivot_table() takes four key parameters: values (the column to aggregate), index (the column whose unique values become the row labels), columns (the column whose unique values become the column labels), and aggfunc (the aggregation function, which defaults to 'mean'). The result is a DataFrame in which each cell contains the aggregated value for that row-column combination.

import pandas as pd

df = pd.DataFrame({
    'region':  ['East', 'West', 'East', 'West', 'East', 'West'],
    'product': ['A',    'A',    'B',    'B',    'A',    'B'],
    'revenue': [200,    340,    150,    290,    310,    410]
})

table = pd.pivot_table(df,
                       values='revenue',
                       index='region',
                       columns='product',
                       aggfunc='sum')
print(table)
# product    A    B
# region
# East     510  150
# West     340  700
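
To see the default aggfunc at work, the sketch below repeats the call on the same data but leaves aggfunc out, so pandas should fall back to 'mean'. The variable name mean_table is just an illustrative choice.

```python
import pandas as pd

# Same data as the example above
df = pd.DataFrame({
    'region':  ['East', 'West', 'East', 'West', 'East', 'West'],
    'product': ['A',    'A',    'B',    'B',    'A',    'B'],
    'revenue': [200,    340,    150,    290,    310,    410]
})

# aggfunc is omitted, so the default ('mean') is applied to each cell
mean_table = pd.pivot_table(df,
                            values='revenue',
                            index='region',
                            columns='product')
print(mean_table)
# product      A      B
# region
# East     255.0  150.0
# West     340.0  350.0
```

East sold product A twice (200 and 310), so its cell becomes the average, 255.0, rather than the total of 510. The values come back as floats because a mean is not generally a whole number.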
