Creating a MultiIndex
Build a hierarchical row index with pd.MultiIndex.from_tuples or set_index on multiple columns, and inspect its levels.
What Is a MultiIndex?
A MultiIndex (also called a hierarchical index) allows a pandas DataFrame or Series to have multiple levels of row labels. Think of it as a composite key in a database: instead of identifying a row by a single label, you identify it by a tuple like (country, city) or (year, quarter). MultiIndexes are well suited to panel data, cross-sectional time series, and any dataset with a natural grouping structure of two or more levels.
import pandas as pd
# A simple example: sales by country and city
data = {
    'sales': [100, 200, 150, 300, 80, 120]
}
index = pd.MultiIndex.from_tuples(
    [('USA', 'New York'), ('USA', 'Chicago'),
     ('UK', 'London'), ('UK', 'Manchester'),
     ('DE', 'Berlin'), ('DE', 'Munich')],
    names=['country', 'city']
)
df = pd.DataFrame(data, index=index)
print(df)
Creating a MultiIndex with from_tuples
pd.MultiIndex.from_tuples(tuples, names=...) is the most explicit way to create a MultiIndex. Each tuple becomes one row label, and the element at each position in the tuple becomes that row's value in the corresponding level. The names parameter assigns a name to each level (e.g. ['year', 'quarter']). This method is useful when you already have a list of composite keys that you want to use as the row index.
import pandas as pd
# Multi-level time index: year x quarter
tuples = [
    (2022, 'Q1'), (2022, 'Q2'), (2022, 'Q3'), (2022, 'Q4'),
    (2023, 'Q1'), (2023, 'Q2'), (2023, 'Q3'), (2023, 'Q4')
]
mi = pd.MultiIndex.from_tuples(tuples, names=['year', 'quarter'])
revenue = [120, 135, 145, 160, 130, 148, 162, 175]
# The result is a Series (not a DataFrame), so name it accordingly
revenue_series = pd.Series(revenue, index=mi, name='revenue_M')
print(revenue_series)
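Once the index exists, it can help to inspect its structure and to try selecting by composite key. The following is a minimal sketch that rebuilds the same year x quarter index; the variable names revenue_by_quarter, q2_2023, and year_2022 are illustrative and do not come from the original example.

```python
import pandas as pd

# Rebuild the year x quarter index used in the example above
tuples = [
    (2022, 'Q1'), (2022, 'Q2'), (2022, 'Q3'), (2022, 'Q4'),
    (2023, 'Q1'), (2023, 'Q2'), (2023, 'Q3'), (2023, 'Q4')
]
mi = pd.MultiIndex.from_tuples(tuples, names=['year', 'quarter'])
revenue_by_quarter = pd.Series(
    [120, 135, 145, 160, 130, 148, 162, 175],
    index=mi, name='revenue_M'
)

# Inspect the structure of the index
print(mi.nlevels)                       # number of levels
print(list(mi.names))                   # name of each level
print(mi.levels[0].tolist())            # unique values stored for 'year'
print(mi.get_level_values('quarter'))   # 'quarter' label of every row

# Select one row with the full composite key (a tuple)
q2_2023 = revenue_by_quarter.loc[(2023, 'Q2')]
print(q2_2023)

# Select all rows under one outer label; the result is indexed by quarter
year_2022 = revenue_by_quarter.loc[2022]
print(year_2022)
```

Passing the full tuple returns a single value, while passing only the outer label returns a shorter Series indexed by the remaining level. That is the practical meaning of the composite-key analogy.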