Pandas & NumPy Academy · Lesson

Joining on the Index

Use DataFrame.join(), or pd.merge() with left_index=True and/or right_index=True, to combine DataFrames that are aligned on their index.

Index-Based Joins

So far you have merged DataFrames by matching values in regular columns. Sometimes, though, the key you want to match on is stored in the DataFrame's index rather than in a column. Pandas supports index-based joins in two ways: DataFrame.join(), and pd.merge() with left_index=True and/or right_index=True, which tell merge() to use that side's index as the join key.
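As a minimal sketch of the merge() route, the example below matches a column on the left against the index on the right. The submissions frame and its user column are hypothetical names invented for this illustration; only left_on and right_index are the pandas parameters being demonstrated.

```python
import pandas as pd

# Lookup table keyed by its index (user name).
profiles = pd.DataFrame(
    {'age': [25, 30, 22]},
    index=['alice', 'bob', 'carol']
)

# Hypothetical frame where the user name lives in a regular column.
submissions = pd.DataFrame(
    {'user': ['carol', 'alice', 'alice'], 'score': [95, 88, 91]}
)

# Match the 'user' column on the left against the index on the right.
# how is not given, so merge() uses its default, 'inner'.
merged = pd.merge(submissions, profiles, left_on='user', right_index=True)
print(merged)
```

Each submission row picks up the age of its user. bob has no submissions, so he does not appear in the result.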

DataFrame.join() Method

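DataFrame.join(other) is a convenience method that joins on the index of both DataFrames by default. Its default join type is 'left', whereas pd.merge() defaults to 'inner', so left.join(right) is equivalent to pd.merge(left, right, left_index=True, right_index=True, how='left'). Rows are matched by index label, so both indexes must hold the same kind of key (user names in the example below). Every row of the left DataFrame is kept; left labels with no match in the right DataFrame get NaN, and right labels that are missing from the left DataFrame are dropped.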

import pandas as pd

profiles = pd.DataFrame(
    {'age': [25, 30, 22]},
    index=['alice', 'bob', 'carol']
)

scores = pd.DataFrame(
    {'score': [88, 95, 70]},
    index=['alice', 'carol', 'dave']
)

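# join: left join on index (default)
result = profiles.join(scores)
print(result)
#        age  score
# alice   25   88.0
# bob     30    NaN   <- bob has no score
# carol   22   95.0
# dave is dropped: he is not in profiles, the left DataFrame.
# score becomes float because NaN cannot be stored in an integer column.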
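To make the relationship between the two APIs concrete, the sketch below rebuilds the same result with pd.merge() and then shows what changes when merge() is left at its own default. It reuses the profiles and scores frames from the lesson; the variable names via_join, via_merge, and inner are chosen only for this illustration.

```python
import pandas as pd

profiles = pd.DataFrame(
    {'age': [25, 30, 22]},
    index=['alice', 'bob', 'carol']
)
scores = pd.DataFrame(
    {'score': [88, 95, 70]},
    index=['alice', 'carol', 'dave']
)

# join() with its default how='left' ...
via_join = profiles.join(scores)

# ... should match merge() on both indexes with how='left' spelled out.
via_merge = pd.merge(
    profiles, scores,
    left_index=True, right_index=True,
    how='left'
)
print(via_join.equals(via_merge))

# Without how=, merge() performs an inner join: only labels present
# in both indexes (alice and carol) survive.
inner = pd.merge(profiles, scores, left_index=True, right_index=True)
print(inner)

# join() accepts the same how= argument for a non-default join type.
inner_via_join = profiles.join(scores, how='inner')
print(inner_via_join)
```

Spelling out how= in both calls is a reasonable habit when switching between join() and merge(), since the differing defaults are easy to overlook.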
