Parsing Dates Correctly
Parse date strings into datetime64 with pd.to_datetime, handle multiple formats, and set a DatetimeIndex.
Why Date Parsing Matters
Date and time data is ubiquitous in real-world datasets: transaction timestamps, birth dates, event logs, and financial periods. When dates are stored as strings (the object dtype), you cannot compute durations or resample by month, and sorting is only chronological if the strings happen to be in ISO 8601 (year-month-day) form. Converting strings to pandas' datetime64 dtype unlocks the full time series API and makes temporal analysis possible.
import pandas as pd
df = pd.DataFrame({'date': ['2024-01-15', '2024-03-20', '2024-07-04']})
print(df['date'].dtype) # object
# You cannot do arithmetic on string dates
# df['date'] + pd.Timedelta('1 day') # TypeError!
# After parsing:
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype) # datetime64[ns]
print(df['date'] + pd.Timedelta('1 day'))
pd.to_datetime() Basics
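pd.to_datetime(series) is the primary function for converting string columns to datetime64. It recognises most common date formats (ISO 8601, US month-first, European day-first, spelled-out month names) without requiring a format string. Since pandas 2.0 the format is inferred from the first value and applied to the whole Series, so a column that mixes layouts needs format='mixed', and day-first strings such as 15/01/2024 are best flagged with dayfirst=True. The result is a Series with dtype datetime64[ns]: nanosecond precision, capable of storing timestamps from 1677 to 2262.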
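The 1677 to 2262 range comes from storing nanoseconds in a 64-bit integer. As a quick check, the bounds can be read from pd.Timestamp.min and pd.Timestamp.max:

```python
import pandas as pd

# Bounds of the nanosecond-resolution Timestamp type.
earliest = pd.Timestamp.min
latest = pd.Timestamp.max

print(earliest.year)  # 1677
print(latest.year)    # 2262
```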
import pandas as pd
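dates = pd.Series([
    '2024-01-15',
    '15/01/2024',
    'January 15, 2024',
    '2024-01-15 08:30:00'
])
# format='mixed' (pandas 2.0+) infers the format of each value separately;
# dayfirst=True reads 15/01/2024 as 15 January.
parsed = pd.to_datetime(dates, format='mixed', dayfirst=True)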
print(parsed)
# 0 2024-01-15 00:00:00
# 1 2024-01-15 00:00:00
# 2 2024-01-15 00:00:00
# 3 2024-01-15 08:30:00
print(parsed.dtype) # datetime64[ns]
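Automatic inference cannot tell whether a string like 03/04/2024 means 3 April or 4 March. When every value in a column shares one known layout, passing an explicit format string removes the guesswork. The sketch below is a minimal illustration, not part of the original lesson: the sample values and variable names are hypothetical, and it assumes day/month/year data.

```python
import pandas as pd

# Hypothetical day-first strings; '03/04/2024' is ambiguous on its own.
raw = pd.Series(['03/04/2024', '15/01/2024', '28/02/2024'])

# Explicit format: every value must match day/month/year exactly.
explicit = pd.to_datetime(raw, format='%d/%m/%Y')
print(explicit.dt.month.tolist())  # [4, 1, 2]

# The same ambiguous string, read month-first (the default) vs. day-first.
month_first = pd.to_datetime('03/04/2024')
day_first = pd.to_datetime('03/04/2024', dayfirst=True)
print(month_first.date(), day_first.date())  # 2024-03-04 2024-04-03
```

An explicit format also fails loudly on a value that does not match, which is usually preferable to a silently misread date.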