Splitting and Replacing Strings
Split strings into multiple columns with .str.split(expand=True) and replace substrings with .str.replace().
Splitting Strings into a List
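.str.split(sep) splits each string in a Series at every occurrence of the separator and returns a Series of lists. Without expand=True, each element is a Python list, which is useful for counting tokens or for further list-level processing. The separator can be a literal string or a regex pattern: by default a single-character separator is matched literally and a longer one is treated as a regular expression, and the regex parameter (pandas 1.4+) overrides that choice. If no separator is given, strings are split on whitespace.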
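The regex case is easier to see with messy delimiters. The following is a minimal sketch, not part of the original lesson: the Series contents and variable names are made up for illustration, and it assumes pandas 1.4 or later so that the regex parameter is available.

```python
import pandas as pd

# Hypothetical data: tags separated by commas or semicolons, with stray spaces
s = pd.Series(['python, data;pandas', 'ml ;ai', 'sql,db , query'])

# A multi-character pattern is treated as a regex by default;
# regex=True just makes that explicit (assumes pandas 1.4+).
tokens = s.str.split(r'\s*[,;]\s*', regex=True)
print(tokens.tolist())
# [['python', 'data', 'pandas'], ['ml', 'ai'], ['sql', 'db', 'query']]

# regex=False forces a literal match. Without it, the '|' in ' | '
# would be read as regex alternation rather than a pipe character.
piped = pd.Series(['a | b', 'c | d'])
literal = piped.str.split(' | ', regex=False)
print(literal.tolist())
# [['a', 'b'], ['c', 'd']]
```

Passing regex explicitly is a reasonable habit whenever the separator contains characters such as '.', '|', or '*', since the default behavior depends on the separator's length.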
import pandas as pd
df = pd.DataFrame({'tags': ['python,data,pandas', 'ml,ai', 'sql,db,query']})
# Split into a list of strings
df['tag_list'] = df['tags'].str.split(',')
print(df['tag_list'])
# 0    [python, data, pandas]
# 1                  [ml, ai]
# 2          [sql, db, query]
# Count tags per row
df['tag_count'] = df['tag_list'].str.len()
print(df['tag_count'].tolist())  # [3, 2, 3]
expand=True to Split into Columns
Passing expand=True to .str.split() returns a DataFrame instead of a Series of lists: each split token becomes its own column, labeled 0, 1, 2, and so on. Rows with fewer tokens than the widest row are padded with missing values. This is the standard way to widen a delimited column, for example splitting a 'first last' full name into separate first-name and last-name columns.
import pandas as pd
df = pd.DataFrame({'full_name': ['Alice Smith', 'Bob Jones', 'Carol White']})
# Split into two columns
name_parts = df['full_name'].str.split(' ', expand=True)
name_parts.columns = ['first_name', 'last_name']
df = pd.concat([df, name_parts], axis=1)
print(df)
#      full_name first_name last_name
# 0  Alice Smith      Alice     Smith
# 1    Bob Jones        Bob     Jones
# 2  Carol White      Carol     White
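Assigning exactly two column names, as in the name example, only works when every value splits into at most two parts. The sketch below shows one way to guard against that using the n parameter of .str.split(), which caps the number of splits. The names in the Series and the 'rest' column label are hypothetical, chosen only to illustrate uneven input.

```python
import pandas as pd

# Hypothetical names with an uneven number of space-separated parts
names = pd.Series(['Alice Smith', 'Mary Ann Lee', 'Cher'])

# Unlimited split: the widest row sets the column count,
# and shorter rows are padded with missing values.
wide = names.str.split(' ', expand=True)
print(wide.shape)  # (3, 3)

# n=1 splits at the first space only, so there are at most two columns
parts = names.str.split(' ', n=1, expand=True)
parts.columns = ['first_name', 'rest']  # 'rest' is an illustrative label
print(parts)
# 'Mary Ann Lee' keeps 'Ann Lee' together in 'rest';
# 'Cher' has no second part, so its 'rest' value is missing.
```

Capping the splits keeps the column count predictable, at the cost of leaving middle names attached to the last name. Whether that trade-off is acceptable depends on the data.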