pd.concat for Stacking DataFrames
Stack DataFrames vertically or horizontally with pd.concat, align on index or columns, and handle duplicate indices.
Why Concatenate DataFrames?
In real projects, data rarely arrives in a single file. You might have monthly sales files, regional survey exports, or database results split across several queries. Concatenation stacks these separate DataFrames into one combined table. Pandas provides pd.concat() for this purpose, handling both vertical (row-wise) and horizontal (column-wise) stacking.
Basic Vertical Concatenation
The most common use of pd.concat() is stacking DataFrames vertically (adding more rows). Pass a list of DataFrames and the function appends them one below the other. For a clean result, the DataFrames should share the same columns: pandas aligns on column names automatically and fills any column missing from one of the inputs with NaN. Each DataFrame keeps its original index, so index labels can repeat in the result.
import pandas as pd
jan = pd.DataFrame({'product': ['A', 'B'], 'sales': [100, 200]})
feb = pd.DataFrame({'product': ['A', 'C'], 'sales': [150, 120]})
combined = pd.concat([jan, feb])
print(combined)
#   product  sales
# 0       A    100
# 1       B    200
# 0       A    150
# 1       C    120
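The output above repeats the index labels 0 and 1, because each input keeps its own index. The following is a minimal sketch of the related options mentioned in the lesson summary: renumbering rows with ignore_index=True, the NaN fill for mismatched columns, and horizontal stacking with axis=1. The mar DataFrame and its returns column are hypothetical, added only for illustration.

```python
import pandas as pd

jan = pd.DataFrame({'product': ['A', 'B'], 'sales': [100, 200]})
feb = pd.DataFrame({'product': ['A', 'C'], 'sales': [150, 120]})
# Hypothetical extra month with a 'returns' column that jan does not have
mar = pd.DataFrame({'product': ['B'], 'sales': [90], 'returns': [5]})

# ignore_index=True discards the original labels and numbers the rows 0..n-1
renumbered = pd.concat([jan, feb], ignore_index=True)
print(renumbered.index.tolist())            # [0, 1, 2, 3]

# Columns are aligned by name; rows that come from jan get NaN in 'returns'
mixed = pd.concat([jan, mar], ignore_index=True)
print(mixed['returns'].isna().tolist())     # [True, True, False]

# axis=1 places the frames side by side, aligning rows on the index.
# add_suffix renames feb's columns so the combined column names stay unique.
side_by_side = pd.concat([jan, feb.add_suffix('_feb')], axis=1)
print(side_by_side.columns.tolist())
# ['product', 'sales', 'product_feb', 'sales_feb']
```

Renumbering is usually the simplest choice when the original row labels carry no meaning, as with the default 0, 1, 2 labels here. If the index does carry meaning (dates or IDs, for example), keeping it may be preferable.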