Last Updated on March 1, 2022 by Jay
There are two ways we can convert a column to datetime in pandas. Datetime is a data type that represents date and time, it’s a convenient data type when dealing with data involving time.
Method 1: Convert from a single data column
We’ll use the below sample dataframe for the demonstration.
Note that:
- Columns a, b, c represent year, month, and day, respectively.
- Both date and date2 column has a dtype (data type) of ‘O’, or Objects, in other words, String.
df = pd.DataFrame({'date':['2022-01-01','2022-02-01','2022-03-01'],
'date2':['2022#01#01','2022#02#01','2022#03#01'],
'a': [2022,2022,2022],
'b': [1,2,3],
'c': [1,1,1] })
The date column has a pretty common format year-month-day, so the pd.to_datetime() method can recognize that without our help.
pd.to_datetime(df['date'])
However, the date2 column has a non-traditional time format with delimiters “#”. Without additional arguments, we’ll get this error message:
ParserError: Unknown string format: 2022#01#01
Let’s help pandas out by providing an argument format. And it works now! This is to show we can use almost any symbol as a delimiter and still be able to convert text to date.
pd.to_datetime(df['date2'], format='%Y#%m#%d')
Method 2: Convert from a list of data columns
Since columns a, b, c contains year, month and day, we can convert those directly to a datetime column in pandas. However, we need to first rename the columns to year, month and day before we can convert. If we don’t rename the column names, it’s not going to work and we’ll hit an error message.
df_date = df.rename(columns={'b':'month','c':'day', 'a':'year'})
NOTE: The input dataframe MUST contain ‘year’, ‘month’, and ‘day’ columns for this method to work.
pd.to_datetime(df_date)
Notice that we don’t even need to put in the ‘year, month, day’ format, as long as those columns are there.
Conclusion
There you have it. Regardless of text or number data, we have a way to convert a column (or columns) into a datetime variable in pandas!
One comment