Understanding the Iterator Protocol in Python
This is an excerpt from our free Python course on Primer.
Iterables and iterators are at the core of Python. Let's learn what they are and how to understand them better.
Some definitions first:
Iterable: An object capable of returning its members one at a time is called an iterable.
To step through the members of an iterable one at a time, Python provides the built-in function iter(). When we pass an iterable as an argument to iter() (which in turn calls the __iter__() dunder method on the object), it returns an object called an iterator.
Let's check this out in code:
>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> iter(guests)
<set_iterator object at 0x7f7983d1dab0>
The code listing above passes the guests set to iter(), which returns a set_iterator object. But what's an iterator?
- An iterator is an object representing a stream of data. When you repeatedly call the built-in next() on the iterator object, it returns successive items from the stream.
- When no more items are available in the stream of data, the StopIteration exception is raised instead.
To understand this, let's continue our code from above.
>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> guest_iterator = iter(guests)
>>> next(guest_iterator)
'Zorro' # Sets are unordered
>>> next(guest_iterator)
'Luffy'
>>> next(guest_iterator)
'Sanji'
>>> next(guest_iterator) # No more items left in the set
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
In the code listing above, we can see how the iterator obtained from guests provides one object at a time, using the built-in iter() and next() functions.
For an object to be iterable, it needs to have the dunder method __iter__() defined. It is the __iter__() method that returns an iterator from an iterable.
The built-in objects such as sequences, sets, and dictionaries have the dunder method __iter__() defined, which lets us use them in a for loop.
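As a quick, informal check of the claim above, the sketch below tests a few built-in containers for an __iter__() attribute and shows that an object without one, such as an int, is rejected by iter(). The names samples and int_is_iterable are illustrative only.

```python
# A few built-in containers; each one defines __iter__().
samples = [[1, 2], (1, 2), "ab", {1, 2}, {"a": 1}]

for sample in samples:
    # hasattr() confirms the dunder method exists on the object.
    print(type(sample).__name__, hasattr(sample, "__iter__"))

# An int has no __iter__(), so iter() rejects it with a TypeError.
try:
    iter(42)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False

print("int iterable?", int_is_iterable)  # int iterable? False
```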
Now that we have some insight into how Python implements iteration, let's try to write a while loop to iterate over a set of objects.
>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> guest_iterator = iter(guests) # Same as guests.__iter__()
>>> while True:
...     try:
...         # Same as guest_iterator.__next__()
...         guest = next(guest_iterator)
...         print(guest)
...     except StopIteration:
...         break
...
Zorro
Luffy
Sanji
The while loop we wrote is pretty close to what happens under the hood when we iterate over an iterable using a for loop. With the for loop, however, we don't need to call the iter() function or handle the StopIteration exception, as the for statement does that automatically for us.
We can describe the iterator protocol as follows:
- First, obtain the iterator of the object using the iter() function or the dunder method <iterable>.__iter__().
- Call the built-in function next() or the dunder method <iterator>.__next__() on the iterator object.
- Run the code block inside the for loop.
- Repeat the next() invocation until the iterator raises StopIteration.
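The steps above can be sketched as a small function. This is only an approximation of what the for statement does, and manual_for is a hypothetical helper name, not something Python provides. A list is used instead of a set so that the order of the items is predictable.

```python
def manual_for(iterable, body):
    """Hypothetical helper: call body(item) for every item, like a for loop."""
    iterator = iter(iterable)      # Step 1: obtain the iterator
    while True:
        try:
            item = next(iterator)  # Step 2: ask for the next item
        except StopIteration:      # Step 4: stop once the iterator is exhausted
            break
        body(item)                 # Step 3: run the loop body


collected = []
manual_for(["Luffy", "Zorro", "Sanji"], collected.append)
print(collected)  # ['Luffy', 'Zorro', 'Sanji']
```

Unlike the earlier while loop, only the next() call sits inside the try block here, so a StopIteration raised by the loop body itself would not be mistaken for the end of the iteration.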
Let's now redefine the terms iterable and iterator.
Iterable: An object that allows iteration is called an iterable. An iterable is required to have an __iter__() method defined that returns an iterator.
Iterator: An iterator is required to have both an __iter__() method and a __next__() method.
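To make the two definitions concrete, here is a minimal sketch of a hand-written iterable and its iterator. The class names Countdown and CountdownIterator are invented for this illustration. The iterable's __iter__() returns a fresh iterator, while the iterator's __iter__() returns itself.

```python
class CountdownIterator:
    """Iterator: defines both __iter__() and __next__()."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # An iterator returns itself, so it can also be used in a for loop.
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration  # Signals that the stream is exhausted
        value = self._current
        self._current -= 1
        return value


class Countdown:
    """Iterable: defines __iter__(), which returns a new iterator."""

    def __init__(self, start):
        self._start = start

    def __iter__(self):
        return CountdownIterator(self._start)


countdown = Countdown(3)
first_pass = list(countdown)   # [3, 2, 1]
second_pass = list(countdown)  # [3, 2, 1] again, from a fresh iterator

iterator = iter(countdown)
same_object = iter(iterator) is iterator  # True
print(first_pass, second_pass, same_object)
```

Because Countdown hands out a new iterator on every call to iter(), it can be looped over repeatedly, whereas a single CountdownIterator is used up after one pass.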
So far, we can summarise everything we have learned about iterables and iterators in the table below.