Profiling with timeit and memory_profiler
Measure execution time with timeit (or IPython's %timeit magic) and peak memory with memory_profiler to find the slowest and most memory-hungry parts of your pipeline.
Why Profile Before Optimising?
A common mistake is to optimise by intuition: rewriting a function that looks slow without measuring whether it is actually the bottleneck. In data pipelines, a small fraction of the code often accounts for most of the execution time (a common rule of thumb is 90% of the time in 10% of the code). Profiling identifies exactly where time and memory are consumed so you can focus optimisation effort where it matters. The golden rule: measure first, optimise second. The timeit module measures execution time, and memory_profiler measures RAM usage.
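The same measure-first habit applies to memory. memory_profiler is a third-party package, so as a dependency-free stand-in the sketch below uses the standard library's tracemalloc instead. tracemalloc tracks allocations made through Python's allocator rather than whole-process RAM, so its numbers will not match memory_profiler's exactly. The helper name peak_bytes and both workloads are hypothetical, chosen only for illustration.

```python
import tracemalloc


def peak_bytes(func):
    """Run func() and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func()
        # get_traced_memory() returns (current, peak) since start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


n = 200_000  # hypothetical workload size

# Materialises every square in a list before summing.
peak_list = peak_bytes(lambda: sum([i * i for i in range(n)]))
# Streams the squares one at a time through a generator.
peak_gen = peak_bytes(lambda: sum(i * i for i in range(n)))

print(f"List version peak:      {peak_list / 1e6:.2f} MB")
print(f"Generator version peak: {peak_gen / 1e6:.2f} MB")
```

Both versions compute the same sum, but only a measurement shows how much more memory the list version holds at its peak.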
import pandas as pd
import numpy as np
import timeit
# Naive approach vs vectorised approach — which is faster?
np.random.seed(0)
df = pd.DataFrame({'a': np.random.randn(100000), 'b': np.random.randn(100000)})
t1 = timeit.timeit(lambda: df['a'] + df['b'], number=100)
t2 = timeit.timeit(lambda: [a + b for a, b in zip(df['a'], df['b'])], number=10)
print(f'Vectorised (100 runs): {t1:.3f}s')
print(f'List comprehension (10 runs): {t2:.3f}s')
print(f'Per-run speedup: ~{(t2/10)/(t1/100):.0f}x')
timeit: Python's Built-In Timer
The timeit module measures execution time by running a statement many times. timeit.timeit(stmt, number=n) runs stmt n times and returns the total elapsed time in seconds; divide by n to get the per-run time. For accurate measurements, choose number so the total run takes at least 0.1 seconds. A single run of a fast operation is not representative, because system scheduling noise can be as large as the operation itself.
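Choosing number by hand can be avoided. As a minimal sketch, the helper below lets timeit.Timer.autorange() pick a loop count large enough that one batch takes at least 0.2 seconds, then repeats the batch and keeps the fastest total. The helper name per_run_seconds and the sum(data) workload are hypothetical; taking the minimum of the repeats is a common convention, not something this lesson prescribes.

```python
import timeit


def per_run_seconds(func, repeats=3):
    """Return (number, best per-run seconds) for a zero-argument callable."""
    timer = timeit.Timer(func)
    # autorange() increases the loop count until one batch takes >= 0.2 s.
    number, _ = timer.autorange()
    # Repeat the batch and keep the fastest total, the one least
    # disturbed by other activity on the machine.
    best_total = min(timer.repeat(repeat=repeats, number=number))
    return number, best_total / number


data = list(range(10_000))  # hypothetical workload
number, seconds = per_run_seconds(lambda: sum(data))
print(f"{number} loops per batch, {seconds * 1e6:.1f} µs per run")
```

The per-run figure is the batch total divided by number, the same division described above, with the loop count chosen automatically.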
import timeit
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame({'x': np.random.randn(10000)})
# Compare three ways to square a column
time_a = timeit.timeit(lambda: df['x'] ** 2, number=1000)
time_b = timeit.timeit(lambda: df['x'].apply(lambda v: v**2), number=100)
time_c = timeit.timeit(lambda: np.square(df['x']), number=1000)
# Per-run time in ms = total seconds / number of runs * 1000
print(f'Operator **2 per run: {time_a/1000*1000:.3f} ms')
print(f'.apply(v**2) per run: {time_b/100*1000:.3f} ms')
print(f'np.square() per run: {time_c/1000*1000:.3f} ms')