Change Column Order In Pandas DataFrame

Sharing is caring!

Last Updated on January 31, 2022 by Jay

In this tutorial, we’ll learn how to change column order in a pandas dataframe. Some might think that the order of the data columns doesn’t matter too much as long as the data is there. Although this is true while working with data inside Python/pandas, once we export results into Excel, column order becomes important. We want to make sure the column order follows a logical sense so the audience can understand the results easily.

Let’s start by creating a sample dataset using pandas and numpy libraries.

The below dataframe construction uses a dictionary comprehension technique, aka “one line for-loop” to quickly create a dataframe with a size of 10 rows X 5 columns. To learn more about this neat looping technique, check out this article.

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {f'col_{str(i)}': np.random.rand(5) for i in range(1,6)}
    )

>>> df
      col_1     col_2     col_3     col_4     col_5
0  0.182444  0.329306  0.356124  0.573469  0.377775
1  0.166817  0.800108  0.876973  0.617122  0.370871
2  0.121286  0.308944  0.611156  0.059454  0.589542
3  0.365968  0.973306  0.515767  0.797248  0.512052
4  0.139206  0.415332  0.836222  0.486267  0.779764

Change column order by pandas dataframe indexing

Remember that we can reference multiple columns by passing a list of column names into the dataframe? For example df[ [ 'col_1', 'col_2', 'col_3' ] ] will give us a dataframe that contains three columns, namely “col_1”, “col_2”, and “col_3”. The best part is that the column order of the resulting dataframe will be exactly the order that you typed those names in. In the below example, pay attention to the order of the list inside df[], as well as the column order of the resulting dataframe.

>>> df[['col_2','col_1','col_5','col_3','col_4']]

      col_2     col_1     col_5     col_3     col_4
0  0.329306  0.182444  0.377775  0.356124  0.573469
1  0.800108  0.166817  0.370871  0.876973  0.617122
2  0.308944  0.121286  0.589542  0.611156  0.059454
3  0.973306  0.365968  0.512052  0.515767  0.797248
4  0.415332  0.139206  0.779764  0.836222  0.486267

Reversing columns by using df.columns

df.columns return the column labels of the dataframe. This is also a convenient way to examine what columns are available inside a dataframe.

>>> df.columns
Index(['col_1', 'col_2', 'col_3', 'col_4', 'col_5'], dtype='object')

We can reverse the dataframe column order by playing with the .columns attribute. When we have a list or list-type object, we can reverse it simply with [::-1]. Then we can pass the reversed column names into the dataframe to reverse the columns of a dataframe.

>>> df.columns[::-1]
Index(['col_5', 'col_4', 'col_3', 'col_2', 'col_1'], dtype='object')

>>> df[df.columns[::-1]]
      col_5     col_4     col_3     col_2     col_1
0  0.377775  0.573469  0.356124  0.329306  0.182444
1  0.370871  0.617122  0.876973  0.800108  0.166817
2  0.589542  0.059454  0.611156  0.308944  0.121286
3  0.512052  0.797248  0.515767  0.973306  0.365968
4  0.779764  0.486267  0.836222  0.415332  0.139206

Change column order by using pandas method .reindex()

The df.reindex() method can shuffle around both rows or columns.

To change the column order: include the columns argument in the method. Note this method doesn’t have an inplace argument, so we can’t modify the dataframe inplace.

>>> df.reindex(columns = ['col_2','col_1','col_4','col_3','col_5'])
      col_2     col_1     col_4     col_3     col_5
0  0.329306  0.182444  0.573469  0.356124  0.377775
1  0.800108  0.166817  0.617122  0.876973  0.370871
2  0.308944  0.121286  0.059454  0.611156  0.589542
3  0.973306  0.365968  0.797248  0.515767  0.512052
4  0.415332  0.139206  0.486267  0.836222  0.779764

On a side note, to change order for rows, simply include the index argument.

>>> df2.re>>> df.reindex(index=[4,3,2,1,0])
      col_1     col_2     col_3     col_4     col_5
4  0.139206  0.415332  0.836222  0.486267  0.779764
3  0.365968  0.973306  0.515767  0.797248  0.512052
2  0.121286  0.308944  0.611156  0.059454  0.589542
1  0.166817  0.800108  0.876973  0.617122  0.370871
0  0.182444  0.329306  0.356124  0.573469  0.377775

Sorting column order by .sort_index()

The .sort_index() method also can sort either columns or rows of a dataframe.

  • Argument axis = 1 will sort by columns
  • Argument axis = 0 will sort by rows/index

Let’s create a df2 with un-ordered columns:

df2 = df[['col_2','col_1','col_5','col_3','col_4']]

>>> df2
      col_2     col_1     col_5     col_3     col_4
0  0.329306  0.182444  0.377775  0.356124  0.573469
1  0.800108  0.166817  0.370871  0.876973  0.617122
2  0.308944  0.121286  0.589542  0.611156  0.059454
3  0.973306  0.365968  0.512052  0.515767  0.797248
4  0.415332  0.139206  0.779764  0.836222  0.486267

To sort the column by order:

>>> df2.sort_index(axis = 1, ascending=True)
      col_1     col_2     col_3     col_4     col_5
0  0.182444  0.329306  0.356124  0.573469  0.377775
1  0.166817  0.800108  0.876973  0.617122  0.370871
2  0.121286  0.308944  0.611156  0.059454  0.589542
3  0.365968  0.973306  0.515767  0.797248  0.512052
4  0.139206  0.415332  0.836222  0.486267  0.779764

The argument ascending=True is optional, and the default is True. We can also sort the columns by descending order:

>>> df2.sort_index(axis = 1, ascending=False)
      col_5     col_4     col_3     col_2     col_1
0  0.377775  0.573469  0.356124  0.329306  0.182444
1  0.370871  0.617122  0.876973  0.800108  0.166817
2  0.589542  0.059454  0.611156  0.308944  0.121286
3  0.512052  0.797248  0.515767  0.973306  0.365968
4  0.779764  0.486267  0.836222  0.415332  0.139206

Leave a Reply

Your email address will not be published. Required fields are marked *