Pandas & NumPy Academy · Lesson

Heatmaps for Correlation Matrices

Compute a correlation matrix with DataFrame.corr() and visualise it as a colour-coded heatmap with sns.heatmap.

What Is a Correlation Matrix?

A correlation matrix is a square, symmetric table in which entry (i, j) holds the Pearson correlation coefficient between column i and column j. Values range from -1 (perfect negative linear correlation) to +1 (perfect positive linear correlation); 0 means no linear relationship, although a non-linear one may still exist. Every diagonal entry is 1 for a non-constant column, because a variable is perfectly correlated with itself. Inspecting the correlation matrix is a common first step in understanding which variables move together before building a model.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the example dataset, then compute the correlation matrix
# for three of its numeric columns
tips = sns.load_dataset('tips')
corr = tips[['total_bill', 'tip', 'size']].corr()
print(corr.round(2))
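
The properties described above (unit diagonal, symmetry, values between -1 and +1) can be checked directly. The following is a minimal sketch on synthetic data rather than the tips dataset; the column names x, y_pos, y_neg and noise are made up for illustration. It also recomputes one entry by hand as covariance divided by the product of the standard deviations, which is the usual definition of Pearson's r.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration; the column names are made up)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    'x': x,
    'y_pos': 2 * x + rng.normal(scale=0.5, size=200),   # moves with x
    'y_neg': -x + rng.normal(scale=0.5, size=200),      # moves against x
    'noise': rng.normal(size=200),                      # unrelated to x
})
corr = df.corr()

# Structural properties of any correlation matrix
values = corr.to_numpy()
diagonal_is_one = np.allclose(np.diag(values), 1.0)
is_symmetric = np.allclose(values, values.T)
within_bounds = bool((np.abs(values) <= 1 + 1e-12).all())

# Pearson's r by hand: covariance divided by the product of standard deviations
cov_xy = df['x'].cov(df['y_pos'])
manual_r = cov_xy / (df['x'].std() * df['y_pos'].std())

print(corr.round(2))
print(diagonal_is_one, is_symmetric, within_bounds)
print(np.isclose(manual_r, corr.loc['x', 'y_pos']))
```

Because the matrix is symmetric, the upper and lower triangles carry the same information, so only one of them needs to be read.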

Computing the Correlation Matrix

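Call DataFrame.corr() to compute pairwise correlations between columns. Since pandas 2.0 the numeric_only parameter defaults to False, so either select the numeric columns first (as below) or pass numeric_only=True; otherwise non-numeric columns, such as the categorical ones in tips, raise an error. The method parameter controls which coefficient is computed: 'pearson' (default, linear), 'spearman' (rank-based; less sensitive to outliers and able to capture monotonic relationships that are not linear), or 'kendall' (Kendall's tau, also rank-based). For skewed financial or count data, Spearman is often more informative than Pearson.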

import pandas as pd
import seaborn as sns

tips = sns.load_dataset('tips')
numeric = tips.select_dtypes(include='number')

# Pearson vs Spearman
pearson = numeric.corr(method='pearson')
spearman = numeric.corr(method='spearman')

print('Pearson:')
print(pearson.round(2))
print('\nSpearman:')
print(spearman.round(2))
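
To see the difference between the two methods, and to tie it back to sns.heatmap, the sketch below draws the Pearson and Spearman matrices side by side. It is an illustration under stated assumptions: the data is synthetic (not the tips dataset), the column names x, y_exp and noise are made up, and the styling choices (coolwarm palette, fixed colour range of -1 to +1, annotated cells) are one reasonable option rather than a required recipe.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data (an assumption for illustration, not the tips dataset):
# 'y_exp' is a monotonic but strongly non-linear function of 'x'.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
df = pd.DataFrame({
    'x': x,
    'y_exp': np.exp(x),
    'noise': rng.normal(size=200),
})

pearson = df.corr(method='pearson')
spearman = df.corr(method='spearman')

# Draw both matrices on the same colour scale so the panels are comparable
fig, axes = plt.subplots(1, 2, figsize=(9, 4))
panels = [('Pearson', pearson), ('Spearman', spearman)]
for ax, (title, matrix) in zip(axes, panels):
    sns.heatmap(
        matrix,
        annot=True, fmt='.2f',   # print each coefficient inside its cell
        cmap='coolwarm',         # diverging palette: negative vs positive
        vmin=-1, vmax=1,         # pin the scale to the full correlation range
        square=True, ax=ax,
    )
    ax.set_title(title)
fig.tight_layout()
# plt.show()  # uncomment to display the figure when running as a script
```

Since y_exp always increases with x, the two columns have identical ranks and the Spearman entry for that pair is 1, while the Pearson entry is lower because the relationship is far from a straight line. Fixing vmin and vmax keeps the colours comparable between panels and between datasets.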
