Last Updated on July 14, 2022 by Jay
The pandas pivot() method can help turn a dataframe from a long to a wide format. We’ll demonstrate how to use it with a simple example in this tutorial.
For a frequent Excel user, we know right away that we can achieve this by using the pivot table function. Essentially we put the country column as “Rows” and the Month as “Columns”. Then drop the Sales into the table as “Values”. The good news is that there’s also a pivot function in pandas!
Sample Dataset
Copy and paste the below code if you want to follow along. The below will create a “long” form dataframe that looks like the left table on the image above.
import pandas as pd
import numpy as np
#make results replicable
np.random.seed(0)
months = (['Jan','Feb','Mar','Apr','May','Jun'] *4)
months.sort()
long_df = pd.DataFrame({
'country': ['Canada','UK','USA','Australia'] *6,
'Month': months,
'Sales': np.random.randint(1,1000,size=24)
})
Pandas Pivot method Syntax
The pandas pivot method syntax is surprisingly similar to Excel’s pivot function.
pandas.DataFrame.pivot(index=None, columns=None, values=None)
- index: string, or a list of string values
- This is the index of the new dataframe. Equivalent to “Rows” for Excel pivot table.
- columns: string, or a list of string values
- This is the columns of the new dataframe. Equivalent to “Columns” for Excel pivot table.
- values: string, or list of string values
- The column to use for populating values of the new dataframe. Equivalent to “Values” for Excel pivot table.
A frequent Excel user is already smiling – we understand how to use this function like a breeze!
Pandas Pivot to Turn A Long table to Wide Format
Let’s plug the column names into the above syntax. Note the below two lines would return the same results. However, the 2nd line is preferred since it’s more explicit on what to use for the arguments. Always remember, “Explicit is better than implicit.”
long_df.pivot(index='country',columns='Month')
long_df.pivot(index='country',columns='Month', values = 'Sales')
The above result is a little less than perfect – The column names are automatically sorted in alphabetical order, not the month order!
There’s a simple fix to it – just alter the column orders! We can actually chain this part with the pivot method into one line of code.
long_df.pivot(index='country', columns='Month',values='Sales')[['Jan','Feb','Mar','Apr','May','Jun']]
Month Jan Feb Mar Apr May Jun
country
Australia 755 360 706 193 175 473
Canada 10 836 601 685 487 805
UK 724 764 397 560 552 600
USA 278 708 315 630 88 71
Additional Resources
Hey!
Hope you are doing well, I’m happy I found your blog.
I have a question: So does pandas pivot aggregate information and can we specify how? (like count/sum/average etc)? Can we add totals here (also define the method – count/sum/average etc)
Thanks for what you are doing!
Hi Tania,
Thanks for dropping by.
The pandas.pivot() method only turns a “long form” dataframe into a “wide form”, or some people prefer to call it converting from 1D to a 2D table. It doesn’t really aggregate the data. For aggregation like count/sum/avg, you want to use the pandas.pivot_table() method, or the pandas.groupby() method.
For groupby() you can find this tutorial here: https://pythoninoffice.com/summarize-data-with-python-pandas-groupby-function
For pivot_table(), that’s actually on my next to-do list! I’ll update my response here once that one is complete.