Casting with astype()
Convert columns to int, float, string, boolean, and datetime using astype() and handle common conversion errors.
Introduction to astype()
Series.astype(dtype) converts a column from its current type to the type you specify. It returns a new Series (or DataFrame when called on a full DataFrame) without modifying the original. This is the primary Pandas tool for fixing dtype problems after loading data, and it can convert between numeric types, to strings, to booleans, and to the Categorical type.
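As a rough sketch of the conversions listed above, the snippet below casts one small frame several ways and checks that the original is left untouched. The column names qty and grade are made up for illustration and do not come from the lesson's data.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({"qty": [1, 0, 3], "grade": ["a", "b", "a"]})

as_float = df["qty"].astype("float32")   # numeric to numeric
as_text = df["qty"].astype(str)          # numeric to string
as_bool = df["qty"].astype(bool)         # 0 becomes False, non-zero becomes True
as_cat = df["grade"].astype("category")  # string to Categorical

print(as_float.dtype)    # float32
print(as_text.tolist())  # ['1', '0', '3']
print(as_bool.tolist())  # [True, False, True]
print(as_cat.dtype)      # category

# astype() returned new objects, so the source column is unchanged
print(df["qty"].dtype)   # int64

# On a DataFrame, a dict casts several columns in one call
converted = df.astype({"qty": "float64", "grade": "category"})
print(converted.dtypes)
```

The object-to-int case, the one most often needed after loading data, looks like this: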
import pandas as pd
df = pd.DataFrame({'age': ['25', '30', '35']})
print('Before:', df['age'].dtype) # object
df['age'] = df['age'].astype(int)
print('After:', df['age'].dtype) # int64
print(df['age'] + 1) # 26, 31, 36: arithmetic now works
Converting to Numeric Types
The most common conversion is from object to a numeric type. You can cast to 'int64', 'int32', 'float64', 'float32', and so on. If any value in the column cannot be parsed as the target type, astype() raises a ValueError. When the column may contain non-numeric strings, use pd.to_numeric(series, errors='coerce') instead: it converts unparseable values to NaN rather than raising an error.
import pandas as pd
df = pd.DataFrame({'price': ['10.5', '20.0', 'N/A', '30.5']})
# astype would crash on 'N/A'
# price_float = df['price'].astype(float) # ValueError!
# pd.to_numeric with errors='coerce' is safer
df['price_float'] = pd.to_numeric(df['price'], errors='coerce')
print(df)
#   price  price_float
# 0  10.5         10.5
# 1  20.0         20.0
# 2   N/A          NaN
# 3  30.5         30.5
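One follow-up to the price example: because NaN is a float, a column cleaned with errors='coerce' comes back as float64 even when every valid value is a whole number. If integers are wanted, one option is pandas' nullable 'Int64' dtype (capital I), which can hold missing values. The sketch below assumes a hypothetical prices Series that is not part of the lesson's data.

```python
import pandas as pd

# Hypothetical whole-number prices with one bad entry
prices = pd.Series(["10", "20", "N/A", "30"])

# A plain astype(int) fails on the unparseable value
astype_failed = False
try:
    prices.astype(int)
except ValueError as exc:
    astype_failed = True
    print("astype failed:", exc)

# Coerce bad values to NaN; the result is float64 because NaN is a float
cleaned = pd.to_numeric(prices, errors="coerce")
print(cleaned.dtype)  # float64

# Cast to the nullable integer dtype; NaN becomes <NA>
whole = cleaned.astype("Int64")
print(whole.dtype)         # Int64
print(whole.isna().sum())  # 1
```

This cast only works when the non-missing values are whole numbers; a column containing 10.5 would need rounding first or should stay float.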