Last Updated on March 4, 2022 by Jay
New pandas users often run into this error when trying to filter a pandas dataframe: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). The message itself is rather ambiguous and gives little insight into what went wrong.
This happens when you use one of Python's logical operators (and, or, not) to combine filter criteria. For example, in the following dataset, let’s get the records where country is Canada or USA.
import numpy as np
import pandas as pd
df = pd.DataFrame({'date':pd.date_range(start='2021-12-01', periods=10, freq='MS'),
'country': ['USA','India','Germany','France','Canada','Netherland','UK','Singapore', 'Australia', 'Canada'],
'a': np.random.randint(10, size=10),
'b': np.random.randint(10, size=10)})
df
date country a b
0 2021-12-01 USA 8 7
1 2022-01-01 India 7 7
2 2022-02-01 Germany 7 4
3 2022-03-01 France 2 6
4 2022-04-01 Canada 3 9
5 2022-05-01 Netherland 3 4
6 2022-06-01 UK 2 7
7 2022-07-01 Singapore 3 9
8 2022-08-01 Australia 7 6
9 2022-09-01 Canada 1 4
df[(df['country'] == 'Canada') or (df['country'] == 'USA')]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_29580/14213871.py in <module>
----> 1 df[(df['country'] == 'Canada') or (df['country'] == 'USA')]
~\Desktop\PythonInOffice\pandas_profiling\venv\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1535 @final
1536 def __nonzero__(self):
-> 1537 raise ValueError(
1538 f"The truth value of a {type(self).__name__} is ambiguous. "
1539 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Understanding The Error
Let’s break the error down by checking each criterion, which is essentially a boolean index. Each criterion works on its own, but when we combine the two with or, we get the ValueError.
a = df['country'] == 'Canada'
0 False
1 False
2 False
3 False
4 True
5 False
6 False
7 False
8 False
9 True
Name: country, dtype: bool
b = df['country'] == 'USA'
0 True
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
Name: country, dtype: bool
a or b
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_29580/4049256035.py in <module>
----> 1 a or b
~\Desktop\PythonInOffice\pandas_profiling\venv\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1535 @final
1536 def __nonzero__(self):
-> 1537 raise ValueError(
1538 f"The truth value of a {type(self).__name__} is ambiguous. "
1539 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
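One way to see where the error comes from, sketched with a small stand-in Series (the name mask is illustrative and not part of the example above): before or can pick a result, Python asks its left operand for a single truth value, and pandas declines to give one for a Series with several elements.

```python
import pandas as pd

# Illustrative stand-in for a boolean mask such as df['country'] == 'Canada'
mask = pd.Series([False, True, False])

# `or`, `and`, `not` and `if` all ask the operand for one truth value first
try:
    bool(mask)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# The methods named in the message collapse the Series to a single value,
# which answers a different question than row-by-row filtering
any_true = mask.any()  # True: at least one element is True
all_true = mask.all()  # False: not every element is True
print(any_true, all_true)
```

So the suggestions in the message are useful when a single yes/no answer about the whole Series is wanted, but they do not help when the goal is to keep or drop individual rows.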
Normal Logical Operators – “The truth value of a Series is ambiguous”
Let’s use three short lists to see what is going on:
- l_a or l_b is exactly the same as l_a
- l_a or l_c is also exactly the same as l_a, even though l_a and l_c are not the same length!
- l_b or l_a is exactly the same as l_b
In each case, or simply returns its first operand unchanged, because any non-empty list counts as true. The contents are never compared item by item. This is exactly why pandas complains that “The truth value of a Series is ambiguous”: rather than silently returning a result we probably did not expect, it raises an error.
l_a = [True, False, True, False]
l_b = [False, False, True, True]
l_c = [False, False]
l_a or l_b
[True, False, True, False]
l_b or l_a
[False, False, True, True]
l_a or l_c
[True, False, True, False]
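The results above follow from how or is defined: it returns its first operand if that operand is truthy, and otherwise its second. A short sketch that reuses the lists above and adds a list comprehension as one possible way to get an item-by-item result for plain lists (the names result and elementwise_or are illustrative):

```python
l_a = [True, False, True, False]
l_b = [False, False, True, True]

# `or` hands back the first operand itself, not a new combined list
result = l_a or l_b
print(result is l_a)  # True

# A list holding only False values is still truthy because it is non-empty
print(bool([False, False]))  # True

# An empty list is falsy, so `or` falls through to the second operand
print([] or l_b)  # [False, False, True, True]

# Item-by-item OR has to be spelled out explicitly for plain lists
elementwise_or = [x or y for x, y in zip(l_a, l_b)]
print(elementwise_or)  # [True, False, True, True]
```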
Bitwise Operators
The short answer is that the normal logical operators (and, or, etc.) don’t compare element by element, in other words, item by item. To get the item-by-item comparison we want, we need to use a bitwise operator, which pandas applies element-wise to boolean Series.
a | b # "a or b"
0 True
1 False
2 False
3 False
4 True
5 False
6 False
7 False
8 False
9 True
Name: country, dtype: bool
Below is a list of bitwise operators:
Name    Bitwise Operator
OR      |
AND     &
NOT     ~
XOR     ^
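To see the remaining operators in action, here is a minimal sketch on a small hand-written frame. The demo data and the variable names are made up for illustration, so the results are reproducible, unlike the random columns in the main example.

```python
import pandas as pd

# Hypothetical data, fixed so the results are reproducible
demo = pd.DataFrame({
    'country': ['USA', 'India', 'Canada', 'Canada'],
    'a': [8, 7, 3, 1],
})

is_canada = demo['country'] == 'Canada'
big_a = demo['a'] > 2

both = demo[is_canada & big_a]         # AND: Canada rows where a > 2
not_canada = demo[~is_canada]          # NOT: every row except Canada
exactly_one = demo[is_canada ^ big_a]  # XOR: one condition holds, not both

print(both.index.tolist())         # [2]
print(not_canada.index.tolist())   # [0, 1]
print(exactly_one.index.tolist())  # [0, 1, 3]
```

When the comparisons are written inline instead of being stored in variables first, each one needs its own parentheses, because &, | and ^ bind more tightly than == and >.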
Once we switch to using the bitwise operator for filtering the dataframe, it works as expected:
df[(df['country'] == 'Canada') | (df['country'] == 'USA')]
date country a b
0 2021-12-01 USA 8 7
4 2022-04-01 Canada 3 9
9 2022-09-01 Canada 1 4
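For this particular case, matching one column against several values, Series.isin is another option that avoids combining masks altogether. The sketch below uses a hypothetical demo frame rather than the df above and checks that both approaches select the same rows:

```python
import pandas as pd

# Hypothetical frame mirroring the 'country' column used above
demo = pd.DataFrame({
    'country': ['USA', 'India', 'Germany', 'Canada'],
    'a': [8, 7, 7, 3],
})

# Two masks combined with the bitwise OR operator
with_or = demo[(demo['country'] == 'Canada') | (demo['country'] == 'USA')]

# A single membership test against a list of values
with_isin = demo[demo['country'].isin(['Canada', 'USA'])]

print(with_or.equals(with_isin))  # True
print(with_isin.index.tolist())   # [0, 3]
```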