Functools - The Power of Higher-Order Functions in Python
The Python standard library includes many great modules that can help you make your code cleaner and simpler, and functools is definitely one of them. This module offers many useful higher-order functions that act on or return other functions, which we can leverage to implement function caching, overloading and decorators, and in general to make our code a bit more functional. So let's take a tour of it and see all the things it has to offer...
Let's start off with the simplest yet quite powerful functions of the functools module. These are the caching functions (which are also decorators) - lru_cache, cache and cached_property. The first of them - lru_cache - provides a least-recently-used cache of function results, or in other words, memoization of results:
from functools import lru_cache
import requests

@lru_cache(maxsize=32)
def get_with_cache(url):
    try:
        r = requests.get(url)
        return r.text
    except requests.RequestException:
        # Note that this fallback value gets cached as well
        return "Not Found"

for url in ["https://google.com/",
            "https://martinheinz.dev/",
            "https://reddit.com/",
            "https://google.com/",
            "https://dev.to/martinheinz",
            "https://google.com/"]:
    get_with_cache(url)

print(get_with_cache.cache_info())
# CacheInfo(hits=2, misses=4, maxsize=32, currsize=4)
print(get_with_cache.cache_parameters())
# {'maxsize': 32, 'typed': False}
In this example we are doing GET requests and caching their results (up to 32 cached results) using the @lru_cache decorator. To see whether the caching really works, we can inspect the cache info of our function using the cache_info method, which shows the number of cache hits and misses. The decorator also provides cache_clear and cache_parameters methods for invalidating cached results and inspecting parameters, respectively.
If you want a bit more granular caching, you can also include the optional typed=True argument, which makes it so that arguments of different types are cached separately.
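To make the effect of typed=True a bit more concrete, here is a minimal sketch. The double function is a made-up example that is not part of the original snippet; it only exists so we can watch the cache counters:

```python
from functools import lru_cache

@lru_cache(maxsize=32, typed=True)
def double(x):  # hypothetical function, used only for illustration
    return x * 2

double(3)    # miss: cached under the int argument 3
double(3.0)  # miss: 3.0 == 3, but float arguments get their own entry
double(3)    # hit

info = double.cache_info()
print(info)
# CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)

double.cache_clear()  # invalidates all cached results and resets the counters
print(double.cache_info().currsize)
# 0
```

Without typed=True, the call with 3.0 would most likely be served from the entry created for 3, because the two arguments compare equal and hash the same.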
Another caching decorator in functools is a function simply called cache. It's a simple wrapper on top of lru_cache which omits the maxsize argument (it behaves like lru_cache(maxsize=None)), making it smaller and faster, as it doesn't need to evict old values.
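As a quick sketch of what that looks like in practice, here is cache applied to a naive recursive Fibonacci function. The fib function is my own illustrative example, and cache requires Python 3.9 or newer:

```python
from functools import cache  # available since Python 3.9

@cache
def fib(n):
    # Naive recursion; the cache ensures each n is computed only once
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))
# 832040
print(fib.cache_info().maxsize)
# None
```

The maxsize of None shows that the cache is unbounded, so keep in mind that it can grow without limit if the function is called with many distinct arguments.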
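There's one more decorator that you can use for caching and it's called cached_property. This one - as you can probably guess - is used for caching the results of instance attributes. It turns a method into a property whose value is computed once and then stored as a normal attribute on the instance. This is very useful if you have a property that is expensive to compute while also being effectively immutable.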
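from functools import cached_property

class Page:
    def __init__(self, title):
        self.title = title  # hypothetical attribute, used only for illustration

    @cached_property
    def render(self):  # a property takes no arguments besides self
        # Placeholder for a long computation that renders the HTML page...
        html = f"<h1>{self.title}</h1>"
        return html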
This simple example shows how we could use a cached property to - for example - cache a rendered HTML page which would get returned to the user over and over again. The same could be done for certain database queries or long mathematical computations.
A nice thing about cached_property is that it runs only on lookups, therefore allowing us to write to the attribute. If we assign to the attribute, the assigned value replaces the cached one and is returned by subsequent lookups, while the decorated method is not run again. It's also possible to clear the cache: all we need to do is delete the attribute, and the next lookup will compute and cache a new value.
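Here is a small sketch of how lookups, assignment and deletion interact with the cache. The Report class and its computations counter are hypothetical and serve only to show when the decorated method actually runs:

```python
from functools import cached_property

class Report:  # hypothetical class, used only for illustration
    def __init__(self):
        self.computations = 0

    @cached_property
    def total(self):
        self.computations += 1  # count how many times the method really runs
        return 42

report = Report()
print(report.total, report.total)  # 42 42
print(report.computations)         # 1 (the method ran only once)

report.total = 100                 # assignment overrides the cached value
print(report.total)                # 100
print(report.computations)         # 1 (no recomputation happened)

del report.total                   # deleting the attribute clears the cache
print(report.total)                # 42 (computed again)
print(report.computations)         # 2
```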
I would end this section with a word of caution for all of the above decorators - do not use them if your function has any side effects or if it creates mutable objects with each call, as those are not the types of functions that you want to have cached.
You probably already know that it's possible to implement comparison operators in Python such as <, >= or == using __lt__, __ge__ or __eq__. It can be quite annoying to implement every single one of __eq__, __lt__, __le__, __gt__ and __ge__ though. Luckily, the functools module includes the @total_ordering decorator that can help us with that - all we need to do is implement __eq__ and one of the remaining methods, and the rest will be automatically provided by the decorator:
from functools import total_ordering

@total_ordering
class Number:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value < other.value

    def __eq__(self, other):
        return self.value == other.value

print(Number(20) > Number(3))
# True
print(Number(1) < Number(5))
# True
print(Number(15) >= Number(15))
# True
print(Number(10) <= Number(2))
# False
The above shows that even though we implemented only __eq__ and __lt__, we're able to use all of the rich comparison operations. The most obvious benefit of this is the convenience of not having to write all those extra magic methods, but probably more important is the reduction of code and its improved readability.
Probably all of us were taught that function overloading isn't possible in Python, but there's actually an easy way to implement it using two functions in the functools module - singledispatch and/or singledispatchmethod. These functions help us implement what is called single dispatch: the implementation to run is chosen at runtime based on the type of the first argument, which is how a dynamically typed programming language such as Python can differentiate between types.
Considering that function overloading is a pretty big topic on its own, I dedicated a separate article to Python's singledispatch and singledispatchmethod, so if you want to know more about this, you can read more about it here.
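Just to give a taste of it, here is a minimal sketch of singledispatch. The describe function and its registered variants are made up for illustration and are not taken from the article mentioned above:

```python
from functools import singledispatch

@singledispatch
def describe(value):
    # Fallback for types without a registered implementation
    return f"object: {value!r}"

@describe.register
def _(value: int):
    # Chosen when the first argument is an int
    return f"integer: {value}"

@describe.register
def _(value: list):
    # Chosen when the first argument is a list
    return f"list of {len(value)} items"

print(describe(42))
# integer: 42
print(describe([1, 2, 3]))
# list of 3 items
print(describe("hi"))
# object: 'hi'
```

The implementation is picked based on the type of the first argument only; the type annotations on the registered functions tell register which type each variant handles.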
We all work with various external libraries or frameworks, many of which provide functions and interfaces that require us to pass in callback functions - for example for asynchronous operations or for event listeners. That's nothing new, but what if we also need to pass in some arguments along with the callback function? That's where functools.partial comes in handy - partial can be used to freeze some (or all) of the function's arguments, creating a new object with a simplified function signature. Confusing? Let's look at some practical examples:
import logging
from multiprocessing import Pool
from functools import partial

def output_result(result, log=None):
    if log is not None:
        log.debug(f"Result is: {result}")

def concat(a, b):
    return a + b

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("default")

# The guard keeps worker processes from re-running this part on import
if __name__ == "__main__":
    p = Pool()
    p.apply_async(concat, ("Hello ", "World"), callback=partial(output_result, log=logger))
    p.close()
    p.join()
The above snippet demonstrates how we could use partial to pass a function (output_result) along with its argument (log=logger) as a callback function. In this case we use multiprocessing.Pool.apply_async, which asynchronously computes the result of the supplied function (concat) and passes its result to the callback function. apply_async will however always pass the result as the first argument, so if we want to include any extra arguments - as in this case log=logger - we have to use partial.
This was a fairly advanced use case, so a more basic example might be simply creating a function that prints to stderr instead of stdout:
import sys
from functools import partial
print_stderr = partial(print, file=sys.stderr)
print_stderr("This goes to standard error output")
With this simple trick we created a new callable (function) that will always pass the file=sys.stderr keyword argument to print, allowing us to simplify our code by not having to specify the keyword argument every time.
And one last example for good measure. We can also use partial to utilize a little-known feature of the iter function - it's possible to create an iterator by passing a callable and a sentinel value to iter, which can be useful in the following application:
from functools import partial

RECORD_SIZE = 64

# Read binary file...
with open("file.data", "rb") as file:
    records = iter(partial(file.read, RECORD_SIZE), b'')
    # The loop has to stay inside the with block, because records reads lazily
    for r in records:
        # Do something with the record...
        ...
Usually, when reading a file, we want to iterate over lines, but in the case of binary data, we might want to iterate over fixed-size records instead. This can be done by creating a callable using partial that reads the specified chunk of data and passing it to iter, which then creates an iterator out of it. This iterator then calls the read function until the end of the file is reached, always taking only the specified chunk of data (RECORD_SIZE). Finally, when the end of the file is reached, read returns the sentinel value (b'') and iteration stops.
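If you want to try this without having a binary file at hand, the same idea can be sketched with an in-memory stream. Here io.BytesIO stands in for the opened file, and the tiny record size and sample bytes are assumptions chosen just to keep the output short:

```python
import io
from functools import partial

RECORD_SIZE = 4  # hypothetical record size, chosen to keep the demo small

# io.BytesIO stands in for a real binary file opened with open(..., "rb")
stream = io.BytesIO(b"aaaabbbbcc")

records = list(iter(partial(stream.read, RECORD_SIZE), b""))
print(records)
# [b'aaaa', b'bbbb', b'cc']
```

Notice that the last record is shorter than RECORD_SIZE, so real code would probably need to decide how to handle a trailing partial record.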
We already spoke about some decorators in the previous sections, but not about decorators for creating more decorators. One such decorator is functools.wraps. To understand why we need it, let's first take a look at the following example:
def decorator(func):
    def actual_func(*args, **kwargs):
        """Inner function within decorator, which does the actual work"""
        print(f"Before Calling {func.__name__}")
        func(*args, **kwargs)
        print(f"After Calling {func.__name__}")

    return actual_func

@decorator
def greet(name):
    """Says hello to somebody"""
    print(f"Hello, {name}!")

greet("Martin")
# Before Calling greet
# Hello, Martin!
# After Calling greet
This example shows how you could implement a simple decorator - we wrap the function that does the actual work (actual_func) in an outer decorator function, which becomes the decorator that we can then attach to other functions - as for example with the greet function here. When the greet function is called you will see that it prints both the messages from actual_func as well as its own. Everything looks fine, no problem here, right? But what if we try the following:
print(greet.__name__)
# actual_func
print(greet.__doc__)
# Inner function within decorator, which does the actual work
When we inspect the name and docstring of the decorated function, we find that they were replaced by the values from inside the decorator function. That's not good - we can't have all our function names and docs overwritten every time we use some decorator. So, how do we solve this? With functools.wraps:
from functools import wraps

def decorator(func):
    @wraps(func)
    def actual_func(*args, **kwargs):
        """Inner function within decorator, which does the actual work"""
        print(f"Before Calling {func.__name__}")
        func(*args, **kwargs)
        print(f"After Calling {func.__name__}")

    return actual_func

@decorator
def greet(name):
    """Says hello to somebody"""
    print(f"Hello, {name}!")

print(greet.__name__)
# greet
print(greet.__doc__)
# Says hello to somebody
The only job of the wraps function is to copy the name, docstring, annotations, etc. from the original function onto the wrapper to prevent them from being overwritten. And considering that wraps is also a decorator, we can just slap it onto our actual_func and the problem is solved!
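One more detail that might come in handy: wraps also stores the original function on the wrapper as the __wrapped__ attribute. The following is a minimal sketch with a hypothetical passthrough decorator and add function (neither is part of the examples above), which also returns the wrapped function's result so that decorated functions with return values keep working:

```python
from functools import wraps

def passthrough(func):  # hypothetical decorator, used only for illustration
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Return the wrapped function's result so callers still receive it
        return func(*args, **kwargs)
    return wrapper

@passthrough
def add(a, b):
    """Adds two numbers"""
    return a + b

print(add(2, 3))
# 5
print(add.__name__)
# add
print(add.__doc__)
# Adds two numbers
print(add.__wrapped__(2, 3))  # calls the original function, bypassing the wrapper
# 5
```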
Last but not least in the functools module is reduce. You might know it from other languages as fold (Haskell). What this function does is take an iterable and reduce (or fold) all its values into a single one. This has many different applications, so here are some of them:
from functools import reduce
import operator

def product(iterable):
    return reduce(operator.mul, iterable, 1)

def factorial(n):
    # range(1, n + 1) includes n itself; the initial value 1 covers factorial(0)
    return reduce(operator.mul, range(1, n + 1), 1)

def sum(numbers):  # Use the built-in `sum` function instead
    # The initial value has to be 0, the identity for addition
    return reduce(operator.add, numbers, 0)

def reverse(iterable):
    return reduce(lambda x, y: y+x, iterable)

print(product([1, 2, 3]))
# 6
print(factorial(5))
# 120
print(sum([2, 6, 8, 3]))
# 19
print(reverse("hello"))
# olleh
As you can see from the above code, reduce can simplify code and oftentimes compress into a single line what would otherwise be much longer. With that said, overusing this function just for the sake of shortening code, making it "clever" or making it more functional is usually a bad idea, as it gets ugly and unreadable really quickly, so in my opinion - use it sparingly.
Also, considering that usage of reduce generally produces one-liners, it's an ideal candidate for partial:
product = partial(reduce, operator.mul)
print(product([1, 2, 3]))
# 6
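One thing to watch out for with this shortcut is the empty iterable. The sketch below repeats the product definition to stay self-contained and shows that extra positional arguments are appended after the frozen ones, so we can still supply an initial value when calling it:

```python
from functools import partial, reduce
import operator

product = partial(reduce, operator.mul)

# Extra positional arguments are appended after the frozen ones,
# so the second argument here becomes reduce's initial value.
print(product([2, 3, 4], 10))
# 240
print(product([], 1))
# 1

try:
    product([])  # no initial value to fall back on
except TypeError as exc:
    print(f"TypeError: {exc}")
```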
And finally, if you need not only the final reduced result but also the intermediate ones, you can use accumulate instead - a function from another great module, itertools. This is how you can use it to compute a running maximum:
from itertools import accumulate
data = [3, 4, 1, 3, 5, 6, 9, 0, 1]
print(list(accumulate(data, max)))
# [3, 4, 4, 4, 5, 6, 9, 9, 9]
As you could see here, functools features a lot of useful functions and decorators that can make your life easier, but this module is really just the tip of the iceberg. As I mentioned in the beginning, the Python standard library includes many modules that can help you build better code, so besides functools, which we explored here, you might also want to check out other modules such as operator or itertools (I wrote an article about this one too, you can check it out here), or just go straight to the Python module index and click on whatever catches your attention. I'm sure you will find something useful in there.