Last Updated on April 20, 2022 by Jay
Sometimes we need to shift a column inside a pandas dataframe, the shift() method provides a convenient way to achieve that.
Let’s create two dataframes for the demonstration. df contains an alphabetical index, and the df2 contains a datetime index.
import pandas as pd
df = pd.DataFrame({'a':range(0,5),
'b':range(5,10)})
df2 = pd.DataFrame({'a':range(0,5),
'b':range(5,10)},
index=pd.date_range("2020-01-01", "2020-01-05"))
pandas shift() syntax
DataFrame.shift(periods = 1,
freq = None,
axis = 0,
fill_value)
Note the pandas.Series object also has a similar method shift() that allows shifting data for Series.
Shift A Column Up/Down In A Pandas Dataframe
To shift columns down, set the periods to a positive number. To shift up columns, set it to a negative number.
Note that only the data is shifted, and the index remained the same. The data shifted and now there are 2 empty rows, which are automatically filled by the np.nan values.
df
a b
0 0 5
1 1 6
2 2 7
3 3 8
4 4 9
#Shift columns down
df.shift(2)
a b
0 NaN NaN
1 NaN NaN
2 0.0 5.0
3 1.0 6.0
4 2.0 7.0
df2
a b
2020-01-01 0 5
2020-01-02 1 6
2020-01-03 2 7
2020-01-04 3 8
2020-01-05 4 9
#Shift columns up
df2.shift(-2)
a b
2020-01-01 2.0 7.0
2020-01-02 3.0 8.0
2020-01-03 4.0 9.0
2020-01-04 NaN NaN
2020-01-05 NaN NaN
Shift Columns for Time-Series Data
When we work with time-series data, we can shift everything including both index and data by including the freq argument. Note the below example, the index shifted down (forward) 2 days along with all the data. Currently, if we want to use the freq argument, the index must be a datetime type of data, otherwise, pandas will raise a NotImplementedError.
df2.shift(periods = 2, freq='D')
a b
2020-01-03 0 5
2020-01-04 1 6
2020-01-05 2 7
2020-01-06 3 8
2020-01-07 4 9
Shift Columns Left Or Right
We can use the axis argument to control the shift direction. By default, axis = 0 which means shift rows (up or down). Setting axis = 1 will shift the columns left or right.
In the below example, we shifted all data to the right by 1 column. Therefore, the first column becomes empty, which is automatically filled by the np.nan values.
df.shift(periods = 1, axis = 1)
a b
0 NaN 0
1 NaN 1
2 NaN 2
3 NaN 3
4 NaN 4
We can also fill the empty rows/columns using the fill_value argument if we don’t want the NaN values.
df.shift(periods = -1, axis = 1, fill_value = 999)
a b
0 5 999
1 6 999
2 7 999
3 8 999
4 9 999
Pandas.Series shift() method
As mentioned previously, the Series class also has a similar shift() method, which works exactly the same way, except this operates on a Series (i.e. a single column) instead of the entire dataframe. Of course, the resulting data is a pandas Series.
df2['a'].shift(periods = 2, freq='D')
2020-01-03 0
2020-01-04 1
2020-01-05 2
2020-01-06 3
2020-01-07 4
Freq: D, Name: a, dtype: int64
One comment