Delete columns from a pandas dataframe

Last Updated on July 14, 2022 by Jay

In this tutorial, we’ll learn how to drop/delete columns from a pandas dataframe. We are going to walk through three methods to achieve this. Depending on the situation, one method may be a better fit than the others.

This article is part of the “Integrate Python with Excel” series. You can find the table of contents here for easier navigation.

Preparing a dataframe

We’ll start off by creating a dataframe to demonstrate how to delete columns. Feel free to download this sample Excel file to follow along.

import pandas as pd
df = pd.read_excel('users.xlsx', index_col=0)

>>> df
             Country      City Gender  Age
User Name                                 
Forrest Gump     USA  New York      M   50
Mary Jane     CANADA   Toronto      F   30
Harry Porter      UK    London      M   10
Jean Grey      CHINA  Shanghai      F   30
Jean Grey     CANADA  Montreal      F   30
Mary Jane     CANADA   Toronto      F   30
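If you don't have the Excel file at hand, the same dataframe can be rebuilt in memory. This is only a sketch: the values are copied from the printed output above, and set_index is used here as a stand-in for index_col=0.

```python
import pandas as pd

# Values copied from the printed output above (a stand-in for users.xlsx)
data = {
    "User Name": ["Forrest Gump", "Mary Jane", "Harry Porter",
                  "Jean Grey", "Jean Grey", "Mary Jane"],
    "Country": ["USA", "CANADA", "UK", "CHINA", "CANADA", "CANADA"],
    "City": ["New York", "Toronto", "London", "Shanghai", "Montreal", "Toronto"],
    "Gender": ["M", "F", "M", "F", "F", "F"],
    "Age": [50, 30, 10, 30, 30, 30],
}

# set_index("User Name") mirrors index_col=0 in the read_excel call
df = pd.DataFrame(data).set_index("User Name")
print(df)
```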

Delete columns from dataframe with `.drop()` pandas method

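Similar to deleting rows, we can delete columns using .drop(). The only difference is that we need to pass the argument axis=1 so that pandas looks for the labels among the columns instead of the rows. By default, .drop() returns a new dataframe and leaves the original untouched. A few notes about .drop():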

To delete a single column: pass in the column name (string)
To delete multiple columns: pass in a list of the names for the columns to be deleted
To overwrite the original dataframe: include the inplace=True argument

df.drop('Country', axis=1)                           # delete a single column
df.drop(['Country', 'City'], axis=1)                 # delete multiple columns
df.drop(['Country', 'City'], axis=1, inplace=True)   # overwrite the original dataframe
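To see the difference between the default behavior and overwriting, here is a small sketch on a made-up three-column dataframe (not the sample file). It also shows the columns= keyword, which pandas accepts as an equivalent to passing the labels with axis=1.

```python
import pandas as pd

# Small made-up dataframe for illustration only
df = pd.DataFrame({
    "Country": ["USA", "UK"],
    "City": ["New York", "London"],
    "Age": [50, 10],
})

# .drop() returns a new dataframe by default; df itself is unchanged
trimmed = df.drop(["Country", "City"], axis=1)

# columns= is an equivalent way to say "these labels are columns"
same = df.drop(columns=["Country", "City"])

# Assigning the result back is an alternative to inplace=True
df2 = df.copy()
df2 = df2.drop("Country", axis=1)

print(list(df.columns))       # the original still has all three columns
print(list(trimmed.columns))
print(list(df2.columns))
```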

Delete columns from pandas dataframe with `del` keyword

del is a keyword in Python that can be used to delete an object. We can use it to delete a column from a dataframe.

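Note that del removes the column from the dataframe itself, so the original dataframe is updated immediately and there is no new dataframe to assign. The output below shows the dataframe as loaded without index_col=0, so User Name appears as a regular column.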

del df['Country']

>>> df
      User Name      City Gender  Age
0  Forrest Gump  New York      M   50
1     Mary Jane   Toronto      F   30
2  Harry Porter    London      M   10
3     Jean Grey  Shanghai      F   30
4     Jean Grey  Montreal      F   30
5     Mary Jane   Toronto      F   30
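A quick way to confirm that del changes the dataframe in place is to look at it through a second variable name. The sketch below uses a made-up dataframe, and alias is just an illustrative name.

```python
import pandas as pd

# Small made-up dataframe for illustration only
df = pd.DataFrame({
    "Country": ["USA", "UK"],
    "City": ["New York", "London"],
    "Age": [50, 10],
})

alias = df          # a second name for the same dataframe object, not a copy

del df["Country"]   # removes the column from the object itself

# The column is gone no matter which name we use to look at the dataframe
print(list(alias.columns))  # ['City', 'Age']
```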

Delete columns from pandas dataframe with Re-assignment method

I also call this the “square bracket method”. It is not a true delete method, but rather a re-assignment operation. However, the end result is the same as a deletion.

Consider our original dataframe, loaded without index_col=0 so that User Name is a regular column. It has 5 columns, namely:

User Name, Country, City, Gender, Age

Let’s say we want to delete the Country and Age columns. Instead of deleting them, we create a new dataframe that contains only User Name, City and Gender, which effectively “deletes” the other two columns. Then we assign the newly created dataframe back to the original variable to complete the “delete” operation. Note the double square brackets in the code: the inner pair is a Python list of column names, and the outer pair selects those columns.

df = df[['User Name', 'City', 'Gender']]
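Here is a self-contained sketch of the re-assignment, assuming a small in-memory dataframe in which User Name is a regular column. The names original and keep are only for illustration. It also shows why this is not a true delete: the old dataframe object still holds all five columns, and only the name df now points at the smaller one.

```python
import pandas as pd

# Made-up rows with the same five columns as the sample data
df = pd.DataFrame({
    "User Name": ["Forrest Gump", "Mary Jane"],
    "Country": ["USA", "CANADA"],
    "City": ["New York", "Toronto"],
    "Gender": ["M", "F"],
    "Age": [50, 30],
})

original = df                           # keep a handle on the old object

keep = ["User Name", "City", "Gender"]  # columns we want to keep
df = df[keep]                           # same as df[['User Name', 'City', 'Gender']]

print(list(df.columns))        # only the kept columns
print(list(original.columns))  # the old object still has all five
```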

Which method to use?

You must be thinking: “Okay, so you showed me three methods. Which one should I use?” The answer is always: it depends. Below are some tips that I use to decide.

.drop()

Works best when the dataframe has many columns and we need to drop only a few. In this case, we only need to list the columns to drop.
However, we need to remember to include the inplace=True argument (or assign the result back to df) if we want to overwrite the original dataframe.

del

Works best when we need to drop only one or two columns. It is the simplest and shortest code to write.
However, del removes one column per statement, so dropping several columns requires a loop or repeated statements, which is more cumbersome than the .drop() method.
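The trade-off can be sketched side by side. The helper make_df and the list to_remove below are hypothetical names used only for this comparison.

```python
import pandas as pd

def make_df():
    # Hypothetical helper that returns a fresh copy of some made-up data
    return pd.DataFrame({
        "Country": ["USA", "UK"],
        "City": ["New York", "London"],
        "Gender": ["M", "M"],
        "Age": [50, 10],
    })

to_remove = ["Country", "City"]

# del handles one column per statement, so several columns need a loop
df_del = make_df()
for col in to_remove:
    del df_del[col]

# .drop() takes the whole list in a single call
df_drop = make_df().drop(to_remove, axis=1)

print(df_del.equals(df_drop))  # True
```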

re-assignment

Works best when the dataframe has only a few columns, or when it has many columns but we are keeping only a few.
If we need to keep many columns, we’ll have to type the names of all the columns we plan to keep, which could be a lot of typing.