Convert column type from string to datetime format in Pandas dataframe
To perform time-series operations, dates should be in the correct format. Let's learn how to convert a Pandas DataFrame column of strings to datetime format.
Pandas Convert Column To DateTime using pd.to_datetime()
pd.to_datetime() function in Pandas is the most effective way to handle this conversion. This is used to handle different formats and to convert string columns to datetime objects.
# Importing pandas
import pandas as pd
# Creating a DataFrame
df = pd.DataFrame({
'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
'Event': ['Music', 'Poetry', 'Theatre'],
'Cost': [10000, 5000, 15000]
})
# Display initial data types
print("Before Conversion:")
print(df.info())
# Converting 'Date' column to datetime
df['Date'] = pd.to_datetime(df['Date'])
# Display data types after conversion
print("\nAfter Conversion:")
print(df.info())
Output:
Before Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 3 non-null object
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
None
After Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 3 non-null datetime64[ns]
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes
None
Before conversion, the Date column has the data type object (string). After using pd.to_datetime(), it changes to datetime64[ns]. Now, let' s explore how can we can convert columns from string-based formats like yymmdd to yyyymmdd using Pandas.
Converting from 'yymmdd' Format
If a column contains dates in the yymmdd format, you can use pd.to_datetime() with the appropriate format to convert it to datetime64[ns].
# Importing pandas library
import pandas as pd
# Initializing the dataset
player_list = [
['200712', 50000],
['200714', 51000],
['200716', 51500],
['200719', 53000],
['200721', 54000],
['200724', 55000],
['200729', 57000]
]
# Creating a pandas DataFrame
df = pd.DataFrame(player_list, columns=['Dates', 'Patients'])
# Displaying the DataFrame and its data types before conversion
print("Before Conversion:")
print(df)
print(df.dtypes)
# Converting 'Dates' from 'yymmdd' to datetime format
df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')
# Displaying the DataFrame and its data types after conversion
print("\nAfter Conversion:")
print(df)
print(df.dtypes)
Output:
Before Conversion:
Dates Patients
0 200712 50000
1 200714 51000
2 200716 51500
3 200719 53000
4 200721 54000
5 200724 55000
6 200729 57000
Dates object
Patients int64
dtype: object
After Conversion:
Dates Patients
0 2020-07-12 50000
1 2020-07-14 51000
2 2020-07-16 51500
3 2020-07-19 53000
4 2020-07-21 54000
5 2020-07-24 55000
6 2020-07-29 57000
Dates datetime64[ns]
Patients int64
dtype: object
Converting from 'yyyymmdd' Format
When the column is already in yyyymmdd format, use pd.to_datetime() with the %Y%m%d format to convert it directly to datetime64[ns].
import pandas as pd
# Initializing the dataset
player_list = [
['20200712', 50000, '20200812'],
['20200714', 51000, '20200814'],
['20200716', 51500, '20200816'],
['20200719', 53000, '20200819'],
['20200721', 54000, '20200821'],
['20200724', 55000, '20200824'],
['20200729', 57000, '20200824']
]
# Creating a pandas DataFrame
df = pd.DataFrame(player_list, columns=['Treatment_start', 'No_of_Patients', 'Treatment_end'])
# Displaying the DataFrame and its data types before conversion
print("Before Conversion:")
print(df)
print(df.dtypes)
# Converting 'Treatment_start' and 'Treatment_end' to datetime format
df['Treatment_start'] = pd.to_datetime(df['Treatment_start'], format='%Y%m%d')
df['Treatment_end'] = pd.to_datetime(df['Treatment_end'], format='%Y%m%d')
# Displaying the DataFrame and its data types after conversion
print("\nAfter Conversion:")
print(df)
print(df.dtypes)
Output:
Convert Column To DateTime using DataFrame.astype() function
Here’s how you can create a DataFrame, inspect its data types, and convert the string-based date column to datetime using DataFrame.astype().
# Importing pandas
import pandas as pd
# Creating the DataFrame
df = pd.DataFrame({
'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
'Event': ['Music', 'Poetry', 'Theatre'],
'Cost': [10000, 5000, 15000]
})
# Checking the initial data type of the 'Date' column
print("Before Conversion:")
df.info()
Output:

The Date column initially has the data type object, indicating it contains strings.
# Converting the 'Date' column to datetime format using astype()
df['Date'] = df['Date'].astype('datetime64[ns]')
# Checking the data type after conversion
print("\nAfter Conversion:")
df.info()
Output:

Using astype('datetime64[ns]')** successfully converts the Date column to the **datetime64[ns]` format.