Delete rows from a dataframe

Last Updated on July 14, 2022 by Jay

Deleting rows is a common task in Excel, in this tutorial, we’ll learn a few techniques to delete rows from a pandas dataframe.

This article is part of the “Integrate Python with Excel” series, you can find the table of content here for easier navigation.

Prepare a dataframe

We’ll use the table below to demonstrate how to delete rows. Feel free to download the sample file to walk through the example.

import pandas as pd
df = pd.read_excel('users.xlsx', index_col=0)

>>> df
             Country      City Gender  Age
User Name                                 
Forrest Gump     USA  New York      M   50
Mary Jane     CANADA   Toronto      F   30
Harry Porter      UK    London      M   10
Jean Grey      CHINA  Shanghai      F   30
Jean Grey     CANADA  Montreal      F   30
Mary Jane     CANADA   Toronto      F   30

Notice the index_col=0 from the above code? If we leave that argument blank, the index will be a 0-based index. By specifying the index_col=0, we ask pandas to use the first column (User Name) as the index.

The `.drop()` method

Let’s delete the 3rd row (Harry Porter) from the dataframe. pandas provides a convenient method .drop() to delete rows. The important arguments for drop() method are listed below, note there are other arguments but we will only cover the following:

label: single label or a list of labels, these can be either row or column labels
axis: default is 0, which means index (i.e. rows). If set to 1, it refers to columns
inplace: this is to tell if pandas should overwrite the original dataframe

Delete rows by name

>>> df.drop('Harry Porter')

             Country      City Gender  Age
User Name                                 
Forrest Gump     USA  New York      M   50
Mary Jane     CANADA   Toronto      F   30
Jean Grey      CHINA  Shanghai      F   30
Jean Grey     CANADA  Montreal      F   30
Mary Jane     CANADA   Toronto      F   30

We skipped the axis argument, which means leaving it as the default value 0, or rows. So we are deleting a row with the index value “Harry Porter”. Also, note that the .drop() method also returns the resulting dataframe. Now is the interesting part, let’s see the dataframe df. It didn’t change! This is because we left out the inplace argument. Most of the pandas methods will return a copy of the dataframe, instead of modifying the original dataframe, unless we specify inplace=True.

>>> df
             Country      City Gender  Age
User Name                                 
Forrest Gump     USA  New York      M   50
Mary Jane     CANADA   Toronto      F   30
Harry Porter      UK    London      M   10
Jean Grey      CHINA  Shanghai      F   30
Jean Grey     CANADA  Montreal      F   30
Mary Jane     CANADA   Toronto      F   30

Use the following 2 methods if you want to overwrite the original dataframe df:

assign the resulting dataframe back to the original df
place inplace=True inside the drop() method

## The following 2 lines of code will give the same result 

df = df.drop('Harry Porter')
df.drop('Harry Porter', inplace=True)

Delete rows by position

We can also use the row (index) position to delete rows. Let’s delete rows 1 and 3, which are Forrest Gump and Harry Porter. In the resulting dataframe, we should see only Mary Jane and Jean Grey.

>>> df = df.drop(df.index [ [ 0,2 ] ])
>>> df
          Country      City Gender  Age
User Name                              
Mary Jane  CANADA   Toronto      F   30
Jean Grey   CHINA  Shanghai      F   30
Jean Grey  CANADA  Montreal      F   30
Mary Jane  CANADA   Toronto      F   30

The boolean index

Boolean index is basically a list of boolean values (True or False). We can use boolean index to filter rows easily, here we can also use it to delete rows conveniently. This time we’ll delete the rows with “Jean Grey” from the dataframe, and assign the result into a new dataframe

>>> df_new = df[df.index != "Jean Grey"]

>>> df_new
             Country      City Gender  Age
User Name                                 
Forrest Gump     USA  New York      M   50
Mary Jane     CANADA   Toronto      F   30
Harry Porter      UK    London      M   10
Mary Jane     CANADA   Toronto      F   30

Next chapter we’ll look at how to delete columns using a few different approaches.

Prepare a dataframe

The .drop() method

Delete rows by name

Delete rows by position

The boolean index

Leave a Reply Cancel reply

The `.drop()` method