Pandas & NumPy Academy · Lesson

Method Chaining with pipe()

Write readable data transformation pipelines using pipe() to chain custom functions alongside native Pandas methods.

The Problem with Intermediate Variables

A data cleaning pipeline without pipe() often accumulates intermediate variables: df1 = clean(df), df2 = transform(df1), df3 = enrich(df2). These variables clutter the namespace, and it is easy to reference the wrong one by mistake (for example, df1 where df2 was intended). The result is code that is hard to read top-to-bottom as a single sequence of transformations.

import pandas as pd

df = pd.read_csv('orders.csv')

# Without pipe — intermediate variables everywhere
df1 = df.dropna(subset=['revenue'])
df2 = df1[df1['quantity'] > 0]
df3 = df2.assign(revenue_per_unit=df2['revenue'] / df2['quantity'])
print(df3.shape)
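
To make the reuse risk concrete, here is a minimal sketch of how one wrong variable name can silently undo a filtering step. The small in-memory DataFrame is made-up sample data standing in for orders.csv, which is not shown in this lesson.

```python
import pandas as pd

# Made-up sample data (an assumption; stands in for orders.csv)
df = pd.DataFrame({
    "revenue": [100.0, None, 50.0, 80.0],
    "quantity": [2, 1, 0, 4],
})

df1 = df.dropna(subset=["revenue"])
df2 = df1[df1["quantity"] > 0]

# Bug: df1 is used instead of df2, so the zero-quantity row slips back in
df3 = df1.assign(revenue_per_unit=df1["revenue"] / df1["quantity"])

print(len(df2), len(df3))
print(df3)
```

Nothing raises an error here: the row with quantity 0 simply reappears in df3, and its revenue_per_unit comes from a division by zero.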

Introducing pipe()

DataFrame.pipe(func) calls func(df) and returns the result, so you can chain custom functions the same way you chain native Pandas methods such as .dropna().query(). Any extra positional or keyword arguments passed to pipe() are forwarded to func after the DataFrame. The key benefit is that every transformation step is explicit and reads left-to-right (or top-to-bottom when the chain is wrapped in parentheses), mirroring the logical order of the pipeline.

def drop_nulls(df):
    return df.dropna(subset=['revenue'])

def filter_positive_qty(df):
    return df[df['quantity'] > 0]

def add_revenue_per_unit(df):
    return df.assign(revenue_per_unit=df['revenue'] / df['quantity'])

# With pipe — clean chain
df_clean = (df
    .pipe(drop_nulls)
    .pipe(filter_positive_qty)
    .pipe(add_revenue_per_unit)
)
print(df_clean.shape)
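
The same idea extends to parameterized steps and to chains that mix pipe() with native methods. The sketch below is one possible arrangement, not the only one: the helpers drop_nulls and filter_min, their parameters, and the sample data are all hypothetical and chosen for illustration.

```python
import pandas as pd

# Made-up sample data (an assumption; stands in for orders.csv)
orders = pd.DataFrame({
    "revenue": [100.0, None, 50.0, 80.0],
    "quantity": [2, 1, 0, 4],
})

def drop_nulls(df, columns):
    """Drop rows that have a missing value in any of the given columns."""
    return df.dropna(subset=columns)

def filter_min(df, column, minimum=0):
    """Keep rows where `column` is strictly greater than `minimum`."""
    return df[df[column] > minimum]

result = (
    orders
    .pipe(drop_nulls, ["revenue"])             # extra positional argument
    .pipe(filter_min, "quantity", minimum=0)   # positional and keyword arguments
    # Native method in the same chain; the lambda receives the
    # already-filtered DataFrame, so no intermediate name is needed
    .assign(revenue_per_unit=lambda d: d["revenue"] / d["quantity"])
    .reset_index(drop=True)
)
print(result)
```

Passing a callable to assign() keeps the native step consistent with the pipe() steps: each stage operates on whatever the previous stage returned.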
