Pandas & NumPy Academy · Lesson

The .str Accessor

Access string methods on a Series using .str and apply lower(), upper(), strip(), and len() across all values.

Introduction to the .str Accessor

Python strings have many useful methods, such as .lower(), .strip(), and .replace(), but a pandas Series does not expose them directly: calling .lower() on a Series raises an AttributeError, so without help you would need a loop. The .str accessor solves this by exposing vectorised versions of Python's built-in string methods on a Series. Each call applies the operation to every element at once, with no explicit loop, and missing values are propagated instead of raising an error (a NaN input yields a NaN output).

import pandas as pd

names = pd.Series(['  Alice ', 'BOB', '  carol  '])

# .strip() removes leading/trailing whitespace from every element
print(names.str.strip())
# 0    Alice
# 1      BOB
# 2    carol

# .lower() lowercases every element
print(names.str.strip().str.lower())
# 0    alice
# 1      bob
# 2    carol
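
The missing-value behaviour and .str.len() described above can be sketched as follows. This is a minimal illustration, not part of the original lesson: the Series name raw and its values are made up for the example, and np.nan is used as the missing-value marker.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one entry is missing
raw = pd.Series(['  Alice ', np.nan, 'BOB'])

# Chained .str calls skip the missing entry instead of raising an error
cleaned = raw.str.strip().str.lower()

# .str.len() counts the characters in each element; missing stays missing
lengths = raw.str.strip().str.len()

print(cleaned)
print(lengths)
```

Because the missing entry stays missing, the cleaned Series keeps the same index and length as the input, so it can be assigned straight back to a DataFrame column.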

Case Conversion Methods

Inconsistent capitalisation is one of the most common data quality issues. The .str accessor provides .lower(), .upper(), .title() (capitalises the first letter of each word and lowercases the rest), and .capitalize() (capitalises only the first character of the string and lowercases the rest). Use .lower() before comparisons and groupby operations to avoid treating 'NYC' and 'nyc' as different groups.

import pandas as pd

df = pd.DataFrame({'city': ['new york', 'LOS ANGELES', 'Chicago', 'HOUSTON']})

df['city_lower'] = df['city'].str.lower()
df['city_title'] = df['city'].str.title()
df['city_upper'] = df['city'].str.upper()

print(df[['city_lower', 'city_title']])
#     city_lower   city_title
# 0     new york     New York
# 1  los angeles  Los Angeles
# 2      chicago      Chicago
# 3      houston      Houston
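
To make the groupby advice concrete, here is a small sketch of how inconsistent casing splits one city into several groups, and how a normalised key fixes it. The sales DataFrame, the amount column, and the city_key column are hypothetical names invented for this illustration. The last lines also contrast .title() with .capitalize().

```python
import pandas as pd

# Hypothetical data: the same city spelled three different ways
sales = pd.DataFrame({
    'city': ['NYC', 'nyc', 'Nyc ', 'Boston'],
    'amount': [100, 50, 25, 80],
})

# Grouping on the raw column treats every spelling as its own group
raw_groups = sales.groupby('city')['amount'].sum()

# Normalise first: strip whitespace, then lowercase
sales['city_key'] = sales['city'].str.strip().str.lower()
clean_groups = sales.groupby('city_key')['amount'].sum()

print(len(raw_groups))    # 4 groups before normalising
print(clean_groups)       # 2 groups after: boston and nyc

# .title() capitalises every word; .capitalize() only the first character
cities = pd.Series(['new york', 'LOS ANGELES'])
print(cities.str.title().tolist())       # ['New York', 'Los Angeles']
print(cities.str.capitalize().tolist())  # ['New york', 'Los angeles']
```

Keeping the normalised value in a separate column, as assumed here, leaves the original spelling available for display while the cleaned key is used for grouping and comparison.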
