Pandas & NumPy Academy · Lesson

Setting and Resetting the Index

Promote a column to the row index with set_index(), restore the default index with reset_index(), and get an overview of MultiIndex.

What Is the DataFrame Index?

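Every Pandas DataFrame has a row index: a sequence of labels, one per row. The default index is a RangeIndex (0, 1, 2, …), but you can replace it with any column that acts as a natural identifier, such as a date, a product ID, or a user ID. Index labels are usually unique, although Pandas does not require it. A meaningful index improves readability and enables label-based selection and slicing with .loc. For time series, resample() operates on a datetime-like index by default, so dates normally need to be converted to datetimes and placed in the index first.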

import pandas as pd

df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'sales': [100, 200, 150]
})
print('Default index:', df.index.tolist())  # [0, 1, 2]
print(df)
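To make the benefits above concrete, here is a minimal sketch of label-based slicing and resampling on a datetime index. It builds the frame with the index already in place (via the index= argument and pd.to_datetime()) rather than promoting a column; the variable names first_two and two_day_totals and the '2D' frequency are illustrative choices, not part of the lesson's own example.

```python
import pandas as pd

# Build the frame with a DatetimeIndex directly (same illustrative values as above).
df = pd.DataFrame(
    {'sales': [100, 200, 150]},
    index=pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03']),
)

# Label-based slicing with .loc: both endpoints are included.
first_two = df.loc['2024-01-01':'2024-01-02']
print(first_two)

# Resampling groups rows into time bins based on the datetime index.
# '2D' (two-day bins) is an arbitrary frequency chosen for this sketch.
two_day_totals = df.resample('2D').sum()
print(two_day_totals)
```

Unlike positional slicing, a .loc label slice includes its end label, which is why first_two contains two rows.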

set_index() — Promoting a Column

DataFrame.set_index('col') promotes the specified column to the row index: its values become the row labels and its name becomes the index name. By default (drop=True) the column is removed from the regular columns. This is the standard way to make a DataFrame more expressive when one column acts as a natural key. set_index() returns a new DataFrame and leaves the original unchanged unless you pass inplace=True, so the result is usually assigned back, as in df = df.set_index('date').

import pandas as pd

df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'sales': [100, 200, 150]
})

df = df.set_index('date')
print(df)
#             sales
# date
# 2024-01-01    100
# 2024-01-02    200
# 2024-01-03    150

# The labels are still plain strings here, not datetimes.
print(df.index)  # Index(['2024-01-01', ...], dtype='object', name='date')
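The following sketch illustrates a few behaviours related to the example above: set_index() leaving the original untouched, the drop=False option, the round trip through reset_index(), and how passing a list of columns produces a MultiIndex. The 'region' column and the variable names (by_date, kept, restored, multi) are hypothetical additions for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'region': ['north', 'south', 'north'],  # hypothetical extra column
    'sales': [100, 200, 150]
})

# set_index() returns a new DataFrame; df itself keeps its RangeIndex.
by_date = df.set_index('date')
print(df.index.tolist())         # [0, 1, 2]
print(by_date.columns.tolist())  # ['region', 'sales']

# drop=False keeps 'date' as a regular column as well as the index.
kept = df.set_index('date', drop=False)
print(kept.columns.tolist())     # ['date', 'region', 'sales']

# reset_index() moves the index back into a column and restores 0, 1, 2, ...
restored = by_date.reset_index()
print(restored.columns.tolist()) # ['date', 'region', 'sales']
print(restored.index.tolist())   # [0, 1, 2]

# Passing a list of columns builds a MultiIndex with one level per column.
multi = df.set_index(['date', 'region'])
print(list(multi.index.names))   # ['date', 'region']
```

If the index should be discarded rather than turned back into a column, reset_index(drop=True) does that instead.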
