Pandas & NumPy Academy · Lesson

Sorting by Index

Reorder rows by their index label with sort_index() and understand when a sorted index improves performance.

Understanding the DataFrame Index

Every pandas DataFrame has a row index: a set of labels used to identify and access rows. By default, this is a RangeIndex (0, 1, 2, …), but you can promote any column (dates, names, IDs) to the index using set_index(). When the index is meaningful (e.g., a DatetimeIndex or a customer ID), sorting by it rather than by a column value produces logically organised output.

import pandas as pd

df = pd.DataFrame(
    {'value': [10, 20, 30]},
    index=['C', 'A', 'B']  # out-of-order alphabetic index
)
print(df)
#    value
# C     10
# A     20
# B     30
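
To make the set_index() step concrete, here is a minimal sketch. The orders table, its column names, and its values are made up for illustration; only the set_index() call itself comes from the lesson text.

```python
import pandas as pd

# Hypothetical data: column names and values are invented for this example.
orders = pd.DataFrame({
    'customer_id': [103, 101, 102],
    'amount': [250.0, 99.5, 410.0],
})

print(orders.index)
# RangeIndex(start=0, stop=3, step=1)

# set_index() returns a new DataFrame whose row labels come from the column.
by_customer = orders.set_index('customer_id')
print(by_customer.index.tolist())
# [103, 101, 102]
```

The labels keep their original order after set_index(); nothing is sorted until you ask for it.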

sort_index() — Ascending

DataFrame.sort_index() reorders rows by their index label rather than by column values. By default, sorting is ascending: lexicographically for string indices (uppercase letters sort before lowercase), numerically for integer indices, and chronologically for a DatetimeIndex. The method returns a new DataFrame and leaves the original unchanged. This is the standard way to restore a dataset to its natural order after shuffling or appending records out of sequence.

import pandas as pd

df = pd.DataFrame(
    {'temp': [22.5, 19.0, 25.1, 18.3]},
    index=pd.to_datetime(['2024-03-01', '2024-01-15', '2024-06-10', '2024-01-01'])
)

sorted_df = df.sort_index()
print(sorted_df)
#             temp
# 2024-01-01  18.3
# 2024-01-15  19.0
# 2024-03-01  22.5
# 2024-06-10  25.1
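
The same call works for string and integer indices. The sketch below uses made-up labels to show the default ordering for each type, and to confirm that sort_index() returns a new object rather than modifying the original.

```python
import pandas as pd

# Hypothetical labels chosen to show the default ordering rules.
names = pd.DataFrame({'score': [1, 2, 3]}, index=['bob', 'Alice', 'Carol'])
ids = pd.DataFrame({'score': [1, 2, 3]}, index=[30, 10, 20])

sorted_names = names.sort_index()
sorted_ids = ids.sort_index()

print(sorted_names.index.tolist())
# ['Alice', 'Carol', 'bob']   (uppercase sorts before lowercase)
print(sorted_ids.index.tolist())
# [10, 20, 30]
print(names.index.tolist())
# ['bob', 'Alice', 'Carol']   (the original is unchanged)
```

Each row moves together with its label, so the values in the score column follow the new order.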
