Pandas & NumPy Academy · Lesson

Profiling with timeit and memory_profiler

Measure execution time with timeit (or the %timeit magic in IPython) and peak memory with memory_profiler to find the slowest and most memory-hungry parts of your pipeline.

Why Profile Before Optimising?

A common mistake is to optimise by intuition: rewriting a function that looks slow without measuring whether it is actually the bottleneck. In data pipelines, a small fraction of the code often accounts for most of the execution time (a common rule of thumb is 90% of the time in 10% of the code). Profiling shows exactly where time and memory are consumed, so you can focus optimisation effort where it matters. The golden rule: measure first, optimise second. The timeit module measures execution time; memory_profiler measures memory usage.

import pandas as pd
import numpy as np
import timeit

# Naive approach vs vectorised approach — which is faster?
np.random.seed(0)
df = pd.DataFrame({'a': np.random.randn(100000), 'b': np.random.randn(100000)})

t1 = timeit.timeit(lambda: df['a'] + df['b'], number=100)
t2 = timeit.timeit(lambda: [a + b for a, b in zip(df['a'], df['b'])], number=10)

print(f'Vectorised (100 runs): {t1:.3f}s')
print(f'List comprehension (10 runs): {t2:.3f}s')
print(f'Per-run speedup: ~{(t2/10)/(t1/100):.0f}x')
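
The comparison above covers time only. For the memory side, the sketch below compares the peak allocation of the same two approaches. Note the substitution: memory_profiler is a third-party package, so this example uses the standard library's tracemalloc module as a dependency-free stand-in. The two tools are not equivalent (tracemalloc traces allocations reported to Python, while memory_profiler samples process memory), and the helper name peak_bytes is hypothetical, introduced only for illustration.

```python
import tracemalloc

import numpy as np


def peak_bytes(func):
    """Return the peak traced allocation, in bytes, while func() runs."""
    tracemalloc.start()
    try:
        result = func()  # hold the result so it is still alive at the reading
        _, peak = tracemalloc.get_traced_memory()  # (current, peak)
    finally:
        tracemalloc.stop()
    return peak


rng = np.random.default_rng(0)
a = rng.standard_normal(100_000)
b = rng.standard_normal(100_000)

# Vectorised: one contiguous float64 result buffer
peak_vectorised = peak_bytes(lambda: a + b)
# List comprehension: a Python list plus one scalar object per element
peak_list = peak_bytes(lambda: [x + y for x, y in zip(a, b)])

print(f'Vectorised peak:         {peak_vectorised / 1e6:.2f} MB')
print(f'List comprehension peak: {peak_list / 1e6:.2f} MB')
```

The exact figures depend on the Python and NumPy versions, so treat them as relative indicators rather than absolute measurements of process RAM.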

timeit: Python's Built-In Timer

The timeit module measures execution time by running a statement many times and returning the total elapsed time. timeit.timeit(stmt, number=n) runs stmt n times and returns the total in seconds; divide by number to get the per-run time. If you omit number, it defaults to 1,000,000 runs, which is far too many for anything but a tiny expression, so set it explicitly. For accurate measurements, choose number so that the total run takes at least 0.1 seconds: a single run of a fast operation is not representative because of system scheduling noise.

import timeit
import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame({'x': np.random.randn(10000)})

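# Compare three ways to square a column.
# The slower .apply() variant gets fewer runs so the comparison finishes quickly.
runs_fast, runs_slow = 1000, 100
time_a = timeit.timeit(lambda: df['x'] ** 2, number=runs_fast)
time_b = timeit.timeit(lambda: df['x'].apply(lambda v: v**2), number=runs_slow)
time_c = timeit.timeit(lambda: np.square(df['x']), number=runs_fast)

# Convert total seconds to milliseconds per run: total / runs * 1000
print(f'Operator **2  per run: {time_a / runs_fast * 1000:.3f} ms')
print(f'.apply(v**2) per run: {time_b / runs_slow * 1000:.3f} ms')
print(f'np.square()  per run: {time_c / runs_fast * 1000:.3f} ms')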
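
Picking number by hand, as in the example above, takes some trial and error. As a possible alternative, the sketch below lets timeit choose the loop count and then repeats the measurement to reduce noise. It uses timeit.Timer.autorange() and Timer.repeat() from the standard library; the array x and the decision to report the minimum of five batches are illustrative assumptions, not part of the lesson's code.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)

timer = timeit.Timer(lambda: x ** 2)

# autorange() raises the loop count until one batch takes at least 0.2 s,
# then returns (loops, total_seconds_for_that_batch).
loops, _ = timer.autorange()

# repeat() runs several independent batches. The minimum is usually the
# least noisy estimate, because interference can only make a batch slower.
batches = timer.repeat(repeat=5, number=loops)
best_per_run_us = min(batches) / loops * 1e6

print(f'{loops} loops per batch, best of 5: {best_per_run_us:.2f} microseconds per run')
```

Reporting the minimum rather than the mean is a common convention for micro-benchmarks; the mean is more appropriate when you care about typical latency under load.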
