Last Updated on March 1, 2022 by Jay
You might see this error message “ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing” when trying to convert data into datetime in pandas.
The likely cause of this error is that you are passing a dataframe to pd.to_datetime() whose columns are not named ‘year’, ‘month’ and ‘day’. When given a dataframe, pd.to_datetime() assembles dates from columns with those names.
Sample Dataset
We’ll use the sample dataframe below, which contains three columns of integers. They represent:
- ‘a’ – year
- ‘b’ – month
- ‘c’ – day
import pandas as pd
df = pd.DataFrame({'a': [2022, 2022, 2022],
                   'b': [1, 2, 3],
                   'c': [1, 1, 1]})
We know that pandas can convert data into datetime objects, so let’s try pd.to_datetime(df). This is what happens:
>>> pd.to_datetime(df)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_29580/2795732595.py in <module>
----> 1 pd.to_datetime(df)

~\Desktop\PythonInOffice\pandas_profiling\venv\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
    888             result = arg._constructor(values, index=arg.index, name=arg.name)
    889     elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):
--> 890         result = _assemble_from_unit_mappings(arg, errors, tz)
    891     elif isinstance(arg, Index):
    892         cache_array = _maybe_cache(arg, format, cache, convert_listlike)

~\Desktop\PythonInOffice\pandas_profiling\venv\lib\site-packages\pandas\core\tools\datetimes.py in _assemble_from_unit_mappings(arg, errors, tz)
    994     if len(req):
    995         _required = ",".join(req)
--> 996         raise ValueError(
    997             "to assemble mappings requires at least that "
    998             f"[year, month, day] be specified: [{_required}] is missing"

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
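If you only want to see which column names pandas reports as missing, without reading the whole traceback, a minimal sketch like the one below catches the exception and prints just its message. It rebuilds the same sample dataframe; the variable name `message` is only for illustration, and the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

# Same sample data as above: 'a' = year, 'b' = month, 'c' = day
df = pd.DataFrame({'a': [2022, 2022, 2022],
                   'b': [1, 2, 3],
                   'c': [1, 1, 1]})

message = None
try:
    pd.to_datetime(df)
except ValueError as err:
    # Keep only the final error message, which names the missing columns
    message = str(err)
    print(message)
```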
How to fix it
The error message is literally saying “you are missing the ‘year’, ‘month’ and ‘day’ columns!” So let’s add these names to the dataframe by renaming the existing columns.
df.rename(columns = {'a':'year','b':'month','c':'day'}, inplace=True)
pd.to_datetime(df)
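If you would rather keep the original column names on the dataframe, one possible variation is to rename a copy and convert that instead. This is a sketch on the same sample data; the names `renamed` and `dates` are just placeholders.

```python
import pandas as pd

df = pd.DataFrame({'a': [2022, 2022, 2022],
                   'b': [1, 2, 3],
                   'c': [1, 1, 1]})

# Without inplace=True, rename() returns a new dataframe and leaves df untouched
renamed = df.rename(columns={'a': 'year', 'b': 'month', 'c': 'day'})

# With the expected column names present, the conversion succeeds
dates = pd.to_datetime(renamed)
print(dates)
```

Here `dates` holds 2022-01-01, 2022-02-01 and 2022-03-01, while `df` still has its columns ‘a’, ‘b’ and ‘c’.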
Another way to change the column names is to assign to the df.columns attribute. This modifies the column names directly, so there’s no need to use ‘inplace’ anywhere.
df.columns
Index(['a', 'b', 'c'], dtype='object')
df.columns = ['year','month','day']
df.columns
Index(['year', 'month', 'day'], dtype='object')
Then, calling pd.to_datetime(df) converts the dataframe into a single column (a pandas Series) of datetime values.
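Putting the df.columns approach together, the sketch below runs the whole thing end to end and checks what comes back. The new column name `date` is only an example; pick whatever name suits your data.

```python
import pandas as pd

df = pd.DataFrame({'a': [2022, 2022, 2022],
                   'b': [1, 2, 3],
                   'c': [1, 1, 1]})

# Overwrite the column names directly; the list must match the number of columns
df.columns = ['year', 'month', 'day']

dates = pd.to_datetime(df)

# The result is a single Series with a datetime dtype
print(type(dates).__name__)
print(pd.api.types.is_datetime64_any_dtype(dates))

# Optionally store the result next to the original columns
df['date'] = dates
```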