Last Updated on July 14, 2022 by Jay
This is an important topic because we’ll be using these techniques a lot in pandas. Python list indexing and slicing are ways to select and filter data from a list or an array-like object. The techniques discussed here also work on tuples.
This tutorial is part of the “Integrate Python with Excel” series; you can find the table of contents here for easier navigation.
List vs tuple
If you are familiar with VBA or other programming languages, Python lists and tuples are basically arrays. A list or tuple can contain any type of object/data. The distinction between them is that a list is mutable (can be modified), and a tuple is immutable (cannot be modified).
Fun fact: a string object behaves much like a tuple! It is an immutable sequence, which means you cannot modify individual letters within a string object.
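To make the mutable vs. immutable distinction concrete, here is a minimal sketch. The variable names and sample values are made up for this illustration. It changes an item in a list, then shows that the same assignment fails on a tuple and on a string.

```python
# Hypothetical sample data, used only for this illustration
my_list = ['A', 'B', 'C']
my_tuple = ('A', 'B', 'C')
my_str = 'ABC'

my_list[0] = 'Z'          # works: lists are mutable
print(my_list)            # ['Z', 'B', 'C']

errors = []               # names of the types that rejected the assignment
for obj in (my_tuple, my_str):
    try:
        obj[0] = 'Z'      # fails: tuples and strings are immutable
    except TypeError as exc:
        errors.append(type(obj).__name__)
        print(type(obj).__name__, '->', exc)
```

Both the tuple and the string raise a TypeError, while the list is updated in place.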
Prepare a list
We’ll use a simple list to demonstrate the techniques. In this article we don’t need any libraries, just pure Python list manipulations. Note that Python uses zero-based indexing, meaning that the index starts from 0 instead of 1.
li = ['A','B','C','D','E','F','G','H','I','J']
li2 = ['A','B','C','D','E']
Basic Python list manipulations
A Python list has only a handful of built-in methods; we’ll look at several of them:
append() – add an item to the end of the list
extend() – add items to the end of the list. The difference between append and extend: append adds 1 item, extend adds every item from another list
remove() – remove the first occurrence of an item from the list
pop() – remove the last item from the list, and return it
insert() – insert an item into the list at a given index
index() – return the index of the first occurrence of an element
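The methods above can be sketched in a short session. The list `letters` and its values are made up for this example and are separate from the `li` list used in the rest of the article.

```python
letters = ['A', 'B', 'C']       # hypothetical sample list

letters.append('D')             # add one item to the end
letters.extend(['E', 'F'])      # add every item from another list
letters.insert(0, 'Z')          # insert 'Z' at index 0
print(letters)                  # ['Z', 'A', 'B', 'C', 'D', 'E', 'F']

letters.remove('Z')             # remove the first 'Z' found
last = letters.pop()            # remove and return the last item
print(last)                     # F
print(letters.index('C'))       # 2
print(letters)                  # ['A', 'B', 'C', 'D', 'E']

# append() with a list adds the list itself as a single item
nested = ['A', 'B']
nested.append(['C', 'D'])
print(nested)                   # ['A', 'B', ['C', 'D']]
```

The last few lines show why extend() is usually what you want when joining two lists: append() would nest the second list inside the first.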
Select an item
>>> li
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
>>> li[0]  # Select the 1st item
'A'
>>> li[9]  # Select the 10th item
'J'
Access items from the end of a list
>>> li[-1]  # Select the last item
'J'
>>> li[-3]  # Select the 3rd last item
'H'
Slicing / Select various items
Python lists use the notation [ n : m ] to indicate a “slice”, which literally means a number of consecutive items. The start index n is included and the end index m is excluded, so the slice contains the items at indexes n through m-1. One way to remember this: the starting item uses the zero-based index, while the ending number matches the one-based position of the last item you want. See the code below for reference.
>>> li[1:5]  # Selects the 2nd through 5th items
['B', 'C', 'D', 'E']
The above slice starts at the 2nd element (index 1) and ends at the 5th element (index 4, one before the end value 5), which are B and E, respectively.
In the following cases, we can omit the starting or ending index:
- Starting from the beginning: li[ : 5] returns the first 5 items: ['A', 'B', 'C', 'D', 'E']
- Ending with the last item: li[ 5 : ] returns everything from index 5 onward, which here is the last 5 items: ['F', 'G', 'H', 'I', 'J']
- We can also do li[ : ], but this simply returns a copy of the full list
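The slicing rules above can be checked with a short sketch. It reuses the article's `li` list; the other variable names are only for illustration.

```python
li = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

middle = li[1:5]        # indexes 1, 2, 3, 4 (the end index 5 is excluded)
first_five = li[:5]     # start omitted: begin at index 0
last_five = li[5:]      # end omitted: run through the last item
full_copy = li[:]       # both omitted: a new list with all the items

print(middle)               # ['B', 'C', 'D', 'E']
print(len(middle) == 5 - 1) # True: a slice [n:m] holds m - n items
print(first_five)           # ['A', 'B', 'C', 'D', 'E']
print(last_five)            # ['F', 'G', 'H', 'I', 'J']

# A slice is a separate list, so changing it leaves li untouched
full_copy[0] = 'Z'
print(li[0])                # A
```

Because the end index is excluded, li[:5] and li[5:] split the list into two parts with no overlap and nothing missing.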
Reverse the list
There are two ways to reverse a list. One is a built-in method and the other is slicing. Note:
- The .reverse() method overwrites the original list
- Slicing does not overwrite the original list, because it returns a new list (a “slice” of the original)
>>> li[::-1]  # returns a reversed copy, li itself is unchanged
['J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']
>>> li.reverse()  # reverses li in place
>>> li
['J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']
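One practical consequence of the in-place behavior is worth a small sketch: .reverse() returns None, so assigning its result to a variable loses the list. The short sample list and variable names below are made up for this example.

```python
letters = ['A', 'B', 'C']        # hypothetical sample list

reversed_copy = letters[::-1]    # new list; letters is not modified yet
print(letters)                   # ['A', 'B', 'C']

result = letters.reverse()       # reverses in place and returns None
print(result)                    # None
print(letters)                   # ['C', 'B', 'A']
print(reversed_copy)             # ['C', 'B', 'A']
```

So write `letters.reverse()` on its own line, and use slicing when you need the reversed list as a new object.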
Combine various lists
There are two ways to combine lists: the .extend() method or just the + sign.
- The extend() method adds the items of the second list to the end of the original list, so the original list is modified.
- The + sign also combines two (or more) lists, but it returns a new list and does not overwrite the original list.
>>> li = ['A','B','C','D','E','F','G','H','I','J']
>>> li2 = ['A','B','C','D','E']
>>> li+li2
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'A', 'B', 'C', 'D', 'E']
>>> li  # original list remains unchanged
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
>>> li.extend(li2)
>>> li  # original list got overwritten
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'A', 'B', 'C', 'D', 'E']
Remove duplicate values from a list
A list can contain any type of data items, including duplicates. There are several ways to remove duplicated values, but I’m going to introduce a more Pythonic way. To do this, we’ll need to know about another data structure: sets. A set is an unordered collection of distinct items. We use a pair of curly brackets to denote sets. The idea is to first convert a list into a set (therefore keeping only the distinct items), then convert the set back to a list. Because a set is unordered, the resulting list may not keep the original order of the items. See the following example:
>>> li
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'A', 'B', 'C', 'D', 'E']
set1 = set(li)  # convert the list to set
list(set1)  # convert the set back to list
>>> li = list(set(li))
>>> li
['H', 'E', 'A', 'J', 'D', 'F', 'I', 'C', 'G', 'B']
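As the output above shows, the set approach does not keep the original order. If the order matters, one common alternative (not the method used in this article) is `dict.fromkeys()`, which relies on dictionaries preserving insertion order in Python 3.7 and later. A minimal sketch, with made-up variable names:

```python
li = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'A', 'B', 'C', 'D', 'E']

# dict keys are unique and keep first-seen order (Python 3.7+)
deduped = list(dict.fromkeys(li))
print(deduped)   # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

# If sorted order is acceptable, sorting the set also gives a stable result
sorted_unique = sorted(set(li))
print(sorted_unique)   # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
```

Both variants need the items to be hashable, the same requirement as for converting to a set.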