The Unknown Features of Python's Operator Module
At first glance, Python's operator module might not seem very interesting. It includes many operator functions for arithmetic and bitwise operations, plus a couple of convenience and helper functions. They might not seem that useful, but with the help of just a few of these functions you can make your code faster, more concise, more readable and more functional. So, in this article we will explore this great Python module and make the most of every function included in it.
The biggest part of the module consists of functions that wrap/emulate basic Python operators, such as +, << or not. It might not be immediately obvious why you would need or want to use any of these when you can just use the operator itself, so let's first talk about some of the use cases for all these functions.
The first reason why you might want to use some of these in your code is that you need to pass an operator to a function:
def apply(op, x, y):
    return op(x, y)

from operator import mul
apply(mul, 3, 7)
# 21
The reason we need to do this is that Python's operators (+, -, ...) are not functions, so you cannot pass them directly to functions. Instead, you can pass in the version from the operator module. You could easily implement a wrapper function that does this for you, but no one wants to create a function for each arithmetic operator, right? Also, as a bonus, this allows for a more functional style of programming.
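To make the idea of passing operators around a bit more concrete, here is a minimal sketch. The OPERATIONS table and the calculate and product helpers are hypothetical names made up for this illustration; only reduce and the operator functions come from the standard library.

```python
from functools import reduce
from operator import add, mul, sub, truediv

# Hypothetical lookup table from operator symbols to operator-module functions.
OPERATIONS = {"+": add, "-": sub, "*": mul, "/": truediv}


def calculate(symbol, x, y):
    """Apply the binary operator named by `symbol` to x and y."""
    return OPERATIONS[symbol](x, y)


def product(numbers):
    """Multiply all numbers together by passing mul to reduce."""
    return reduce(mul, numbers, 1)


print(calculate("*", 3, 7))   # 21
print(product([1, 2, 3, 4]))  # 24
```

Because the operator functions are ordinary callables, they can be stored in a dictionary or handed to reduce without any wrapper code.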
You might also think, "I don't need the operator module for this, I can just use a lambda expression!" Yes, but here comes the second reason why you should use this module: its functions are faster than lambdas. You obviously won't notice that with a single execution, but if you run it in a loop enough times, it can make a noticeable difference:
python -m timeit "(lambda x,y: x + y)(12, 15)"
10000000 loops, best of 3: 0.072 usec per loop
python -m timeit -s "from operator import add" "add(12, 15)"
10000000 loops, best of 3: 0.0327 usec per loop
So if you're used to writing something like (lambda x,y: x + y)(12, 15), you might want to switch to operator.add(12, 15) for a little performance boost.
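Note that the lambda command above also builds the lambda on every loop, so part of the measured difference is creation cost. If you want to compare only the calls, a rough sketch with the timeit module could look like this. The time_call helper and the number of repetitions are assumptions for illustration, and the results will differ between machines and Python versions, so no timings are shown here.

```python
import timeit
from operator import add

# Define the lambda once, so only the call itself is timed.
lambda_add = lambda x, y: x + y


def time_call(func, number=200_000):
    """Return the total seconds needed to call func(12, 15) `number` times."""
    return timeit.timeit("f(12, 15)", globals={"f": func}, number=number)


print(f"lambda:       {time_call(lambda_add):.4f} s")
print(f"operator.add: {time_call(add):.4f} s")
```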
The third and, for me, the most important reason to use the operator module is readability. This is more of a personal preference, and if you use lambda expressions all the time, then it might be more natural for you to use those, but in my opinion it's generally more readable to use functions from the operator module rather than lambdas. For example, consider the following:
(lambda x, y: x ^ y)(7, 10)
from operator import xor
xor(7, 10)
Clearly the second option is more readable.
Finally, unlike lambdas, operator module functions are picklable, meaning that they can be saved and later restored. This might not seem very useful, but it's necessary for distributed and parallel computing, which requires the ability to pass functions between processes.
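A small sketch of the difference, using only the standard pickle module. The can_pickle helper is a hypothetical name introduced just for this example:

```python
import pickle
from operator import add

# operator.add is pickled by reference, so it survives a round trip.
restored_add = pickle.loads(pickle.dumps(add))
print(restored_add(2, 3))  # 5


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


print(can_pickle(add))                 # True
print(can_pickle(lambda x, y: x + y))  # False
```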
As I already mentioned, this module has a function for every Python arithmetic, bitwise and truth operator, as well as some extras. For the full mapping between functions and the actual operators, see the table in the docs.
Along with all the expected functions, this module also features their in-place versions that implement operations such as a += b or a *= b. If you want to use these, you can just prefix the basic versions with i, for example iadd or imul.
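One detail worth keeping in mind is that the in-place functions return their result, and only mutable operands are actually changed in place. A short sketch with made-up variable names:

```python
from operator import iadd

# Mutable operand: the list is updated in place and the same object is returned.
items = [1, 2]
result = iadd(items, [3])
print(items)            # [1, 2, 3]
print(result is items)  # True

# Immutable operand: the original is untouched, so keep the return value.
count = 5
new_count = iadd(count, 2)
print(count, new_count)  # 5 7
```

In other words, a = iadd(a, b) behaves like a += b, while calling iadd(a, b) and ignoring the result only works for mutable objects.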
Finally, in operator you will also find dunder versions of many of these functions, for example __add__ or __mod__. These are kept for backward compatibility, and the versions without underscores should be preferred.
Apart from all the actual operators, this module has some more features that can come in handy. One of them is the little-known length_hint function, which can be used to get a vague idea of the length of an iterator:
from operator import length_hint
iterator = iter([2, 4, 12, 5, 18, 7])
length_hint(iterator)
# 6
iterator.__length_hint__()
# 6
I want to highlight the word vague here: don't rely on this value, because it really is just a hint and makes no guarantees of accuracy.
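To show how loose the hint can be, here is a small sketch. It relies on the optional second argument of length_hint, which is the default returned when an object offers no hint at all:

```python
from operator import length_hint

# A list iterator updates its hint as items are consumed.
iterator = iter([2, 4, 12, 5, 18, 7])
next(iterator)
print(length_hint(iterator))  # 5

# A generator provides no hint, so the default (0 unless given) is returned.
squares = (n * n for n in range(10))
print(length_hint(squares))      # 0
print(length_hint(squares, 10))  # 10
```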
Another convenience function that we can grab from this module is countOf(a, b), which returns the number of occurrences of b in a, for example:
from operator import countOf
countOf([1, 4, 7, 15, 7, 5, 4, 7], 7)
# 3
And the last of these simple helpers is indexOf(a, b), which returns the index of the first occurrence of b in a:
from operator import indexOf
indexOf([1, 4, 7, 15, 7, 5, 4, 7], 7)
# 2
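Both helpers also work on iterables other than lists, and indexOf raises ValueError when the value is missing. The sketch below illustrates this; safe_index is a hypothetical wrapper written only for this example:

```python
from operator import countOf, indexOf

# Both helpers accept arbitrary iterables, including strings and generators.
print(countOf("mississippi", "s"))              # 4
print(indexOf((n * n for n in range(10)), 49))  # 7


def safe_index(iterable, value, default=-1):
    """Return the index of value, or `default` instead of raising ValueError."""
    try:
        return indexOf(iterable, value)
    except ValueError:
        return default


print(safe_index([1, 4, 7], 15))  # -1
```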
Apart from operator functions and a couple of the above-mentioned utility functions, the operator module also includes functions that are meant to be used together with higher-order functions. These are attrgetter and itemgetter, which are most often used as key functions, usually in conjunction with functions such as sorted or itertools.groupby.
To see how they work and how you can use them in your code, let's look at a couple of examples.
Let's say we have a list of dictionaries, and we want to sort them by a common key. Here's how we can do it with itemgetter:
rows = [
    {"name": "John", "surname": "Doe", "id": 2},
    {"name": "Andy", "surname": "Smith", "id": 1},
    {"name": "Joseph", "surname": "Jones", "id": 3},
    {"name": "Oliver", "surname": "Smith", "id": 4},
]

from operator import itemgetter
sorted_by_name = sorted(rows, key=itemgetter("surname", "name"))
# [{"name": "John", "surname": "Doe", "id": 2},
#  {"name": "Joseph", "surname": "Jones", "id": 3},
#  {"name": "Andy", "surname": "Smith", "id": 1},
#  {"name": "Oliver", "surname": "Smith", "id": 4}]

min(rows, key=itemgetter("id"))
# {"name": "Andy", "surname": "Smith", "id": 1}
In this snippet we use the sorted function, which accepts an iterable and a key function. This key function has to be a callable that takes a single item from the iterable (rows) and extracts the value used for sorting. In this case we pass in itemgetter, which creates the callable for us. We also give it dictionary keys from rows, which are then fed to each object's __getitem__, and the results of the lookup are used for sorting. As you probably noticed, we used both surname and name; this way we can sort on multiple fields at once.
The last lines of the snippet also show another use for itemgetter, which is looking up the row with the minimum value of the id field.
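To see why passing several keys gives a multi-field sort, it helps to call the created callable directly. This is a small sketch with a single made-up row:

```python
from operator import itemgetter

row = {"name": "John", "surname": "Doe", "id": 2}

# One key: the callable returns the looked-up value itself.
get_id = itemgetter("id")
print(get_id(row))         # 2

# Several keys: the callable returns a tuple of the looked-up values.
get_full_name = itemgetter("surname", "name")
print(get_full_name(row))  # ('Doe', 'John')
```

Tuples compare element by element, so sorting on ('surname', 'name') tuples orders by surname first and falls back to name for equal surnames.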
Next up is the attrgetter function, which can be used for sorting in a similar way to itemgetter above. More specifically, we can use it to sort objects that don't have native comparison support:
class Order:
    def __init__(self, order_id):
        self.order_id = order_id

    def __repr__(self):
        return f"Order({self.order_id})"

orders = [Order(23), Order(6), Order(15), Order(11)]

from operator import attrgetter
sorted(orders, key=attrgetter("order_id"))
# [Order(6), Order(11), Order(15), Order(23)]
Here we use the order_id attribute to sort orders by their IDs.
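attrgetter also accepts dotted names for nested attributes and, like itemgetter, several names at once. The sketch below assumes a hypothetical Customer class and a customer field on Order, neither of which is part of the example above:

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class Customer:  # hypothetical class, for illustration only
    name: str


@dataclass
class Order:
    order_id: int
    customer: Customer


orders = [
    Order(23, Customer("Smith")),
    Order(6, Customer("Jones")),
    Order(15, Customer("Doe")),
]

# A dotted name walks through nested attributes: order.customer.name
by_customer = sorted(orders, key=attrgetter("customer.name"))
print([order.order_id for order in by_customer])  # [15, 6, 23]

# Several names produce a tuple, just like itemgetter.
print(attrgetter("order_id", "customer.name")(orders[0]))  # (23, 'Smith')
```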
Both of the functions shown above are very useful when combined with some functions from the itertools module, so let's see how we can use itemgetter to group elements by one of their fields:
orders = [
    {"date": "07/10/2021", "id": 10001},
    {"date": "07/10/2021", "id": 10002},
    {"date": "07/12/2021", "id": 10003},
    {"date": "07/15/2021", "id": 10004},
    {"date": "07/15/2021", "id": 10005},
]

from operator import itemgetter
from itertools import groupby

orders.sort(key=itemgetter("date"))
for date, rows in groupby(orders, key=itemgetter("date")):
    print(f"On {date}:")
    for order in rows:
        print(order)
    print()
# On 07/10/2021:
# {"date": "07/10/2021", "id": 10001}
# {"date": "07/10/2021", "id": 10002}
# On 07/12/2021:
# {"date": "07/12/2021", "id": 10003}
# On 07/15/2021:
# {"date": "07/15/2021", "id": 10004}
# {"date": "07/15/2021", "id": 10005}
Here we have a list of rows (orders) which we want to group by the date field. To do that, we first sort the list and then call groupby to create groups of items with the same date value. If you're wondering why we needed to sort the list first, it's because the groupby function works by looking for consecutive records with the same value, therefore all the records with the same date need to be next to each other beforehand.
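The effect of skipping the sort is easy to demonstrate. In this sketch, records and group_keys are made-up names used only to show which group keys groupby produces:

```python
from itertools import groupby
from operator import itemgetter

records = [("a", 1), ("b", 2), ("a", 3)]


def group_keys(rows):
    """Return the key of every group that groupby produces for rows."""
    return [key for key, _ in groupby(rows, key=itemgetter(0))]


# Unsorted input: "a" shows up as two separate groups.
print(group_keys(records))                             # ['a', 'b', 'a']

# Sorted input: equal keys are adjacent, so each key forms one group.
print(group_keys(sorted(records, key=itemgetter(0))))  # ['a', 'b']
```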
In the previous examples we worked with lists of dictionaries, but these functions can also be applied to other iterables. We can, for example, use itemgetter to sort a dictionary by its values, find the index of the minimum/maximum value in a list, or sort a list of tuples based on some of their fields:
# Sort dict by value
from operator import itemgetter
products = {"Headphones": 55.90, "USB drive": 12.20, "Ethernet Cable": 8.12, "Smartwatch": 125.80}
sort_by_price = sorted(products.items(), key=itemgetter(1))
# [('Ethernet Cable', 8.12), ('USB drive', 12.2), ('Headphones', 55.9), ('Smartwatch', 125.8)]
# Find index of maximum value in array
prices = [55.90, 12.20, 8.12, 99.80, 18.30]
index, price = max(enumerate(prices), key=itemgetter(1))
# 3, 99.8
# Sort list of tuples by surname (index 1), then by first name (index 0)
names = [
    ("John", "Doe"),
    ("Andy", "Jones"),
    ("Joseph", "Smith"),
    ("Oliver", "Smith"),
]
sorted(names, key=itemgetter(1, 0))
# [("John", "Doe"), ("Andy", "Jones"), ("Joseph", "Smith"), ("Oliver", "Smith")]
The last function from the operator module that needs to be mentioned is methodcaller. This function can be used to call a method on an object using its name supplied as a string:
from operator import methodcaller
methodcaller("rjust", 12, ".")("some text")
# "...some text"
column = ["data", "more data", "other value", "another row"]
[methodcaller("rjust", 12, ".")(value) for value in column]
# ["........data", "...more data", ".other value", ".another row"]
In the first example above we essentially use methodcaller to call "some text".rjust(12, "."), which right-justifies the string to a length of 12 characters with . as the fill character.
Using this function makes more sense in situations where you have the name of the desired method as a string and want to supply the same arguments to it over and over again, as in the second example above.
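Besides positional arguments, methodcaller also forwards keyword arguments, and the resulting callable can serve as a key function. A brief sketch with made-up sample data:

```python
from operator import methodcaller

# Keyword arguments are forwarded to the method as well.
split_once = methodcaller("split", ",", maxsplit=1)
print(split_once("a,b,c"))  # ['a', 'b,c']

# Used as a key function: sort case-insensitively via str.lower.
words = ["banana", "Apple", "cherry"]
print(sorted(words, key=methodcaller("lower")))  # ['Apple', 'banana', 'cherry']
```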
Another, more practical example of using methodcaller is the following code. Here we feed the lines of a text file to the map function and also pass it our desired method, in this case strip, which strips whitespace from each of the lines. Additionally, we pass the result of that to filter, which removes all the empty lines (empty lines become empty strings, which are falsy, so they get removed by filter).
from operator import methodcaller

# path is assumed to hold the location of a text file
with open(path) as file:
    items = list(filter(None, map(methodcaller("strip"), file.read().splitlines())))
print(items)
In this article we took a quick tour of the (in my opinion) underrated operator module. This shows that even a small module with just a couple of functions can be very useful in your daily Python programming tasks. There are many more useful modules in Python's standard library, so I recommend just checking the module index and diving in. You can also check out my previous articles, which explore some of these modules, such as itertools or functools.