Useful Series Methods
Use describe(), value_counts(), unique(), and map() to summarise and transform a Series quickly.
Quick Summary with describe()
s.describe() generates a statistical summary in a single call: count, mean, standard deviation, min, 25th percentile (Q1), median (Q2), 75th percentile (Q3), and max. For string Series it returns count, unique, top (the most common value), and freq (how often that value occurs) instead. It is a quick way to get a feel for a column's distribution when first exploring a dataset.
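The numeric case is shown in the example that follows; the string case can be sketched in the same way. The snippet below is a minimal illustration that assumes a small, made-up Series of city names (the data and the variable names cities and summary are hypothetical, not from the original text).

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
cities = pd.Series(["Paris", "Rome", "Paris", "Oslo", "Paris", "Rome"])

# For a string Series, describe() reports count, unique, top, and freq.
summary = cities.describe()
print(summary)
# Values: count 6, unique 3, top 'Paris', freq 3

# The result is itself a Series, so each statistic can be read by its label.
print(summary["top"], summary["freq"])
```

Because the summary is an ordinary Series, individual entries such as summary["unique"] can be used directly in later code rather than read off the printed output.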
import pandas as pd
ages = pd.Series([25, 32, 19, 45, 28, 31, 22, 40])
print(ages.describe())
# count 8.000000
# mean 30.250000
# std 8.811...
# min 19.000000
# 25% 24.250000
# 50% 29.500000
# 75% 34.000000
# max 45.000000
value_counts() for Frequency Tables
s.value_counts() returns a new Series with the unique values as the index and their occurrence counts as the data, sorted descending by count. Pass normalize=True to get proportions instead of raw counts. This is the first thing to run on a categorical column to understand its distribution.
import pandas as pd
colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'red'])
print(colors.value_counts())
# red 3
# blue 2
# green 1
print(colors.value_counts(normalize=True).round(2))
# red 0.50
# blue 0.33
# green 0.17
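Since value_counts() returns a regular Series indexed by the unique values, its result can be queried like any other Series. The following is a small sketch reusing the colors data from above; the variable names counts, red_count, most_common, and shares are illustrative choices, not part of the original example.

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'red'])
counts = colors.value_counts()

# The unique values are the index labels, so a count can be looked up by label.
red_count = counts['red']

# idxmax() returns the label that has the highest count.
most_common = counts.idxmax()

# With normalize=True the values are proportions, which sum to 1.
shares = colors.value_counts(normalize=True)
total = shares.sum()

print(red_count, most_common, round(total, 2))
# 3 red 1.0
```

Looking values up by label avoids relying on the sort order, which matters when two categories have the same count.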