Recently I needed a way to infinitely loop over a list in Python. Traditionally, this is extremely easy to do with simple indexing if the size of the list is known in advance. For example, an approach could look something like this:
```python
l = [1, 2, 3]
i = 0

while True:
    print(l[i])
    i += 1
    if i == len(l):
        i = 0
```
Eventually I settled on a built-in approach using the `itertools` module from the standard library. As a result, the code became a lot cleaner:
```python
import itertools

l = [1, 2, 3]

for n in itertools.cycle(l):
    print(n)
```
But for the fun of it, I decided to try to re-implement `itertools.cycle` myself, and in order to do that, I first have to understand how generators work in Python 3. I already do, but I will explain them here, after which I will demonstrate the re-implementation.
Both iterators and generators have their own short sections in the official Python 3 tutorial. This post is my own explanation of them.
Iterators are objects that define a `__next__` method that is called every time the next value of the iterable is desired. They can be iterated over, yielding their members one by one.
An object is said to be iterable if it defines an `__iter__` method that returns such an iterator.
It can be a little confusing. Think of it like this: strings are iterable (able to be iterated over) because their class (`str`) defines an `__iter__` method that returns an iterator object, which in turn has a `__next__` method.
For example, in the following code:
```python
for n in "abc":
    print(n)
```
Behind the scenes, since `"abc"` is of type `str`, a class that defines the `__iter__` method, that method is called on `"abc"` and an iterator object is returned. `for` then simply calls the `__next__` method on that iterator object, passing its return value to you in the form of the loop variable, until `__next__` raises a `StopIteration` exception, at which point the loop stops.
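The loop above can be sketched by hand; this is a rough equivalent of what `for` does behind the scenes (ignoring details like the loop's `else` clause):

```python
# A rough manual equivalent of `for n in "abc": print(n)`
iterator = iter("abc")      # iter() calls "abc".__iter__() behind the scenes
while True:
    try:
        n = next(iterator)  # next() calls iterator.__next__()
    except StopIteration:   # raised once the string is exhausted
        break
    print(n)
```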
More precisely, `for` calls the built-in `next()` on the iterator object, which in turn calls its `__next__` method. The `next()` built-in is handy because it accepts a second parameter that can be used to specify a value to be returned if the iterator object is already exhausted.
We can see this in action:
```python
>>> string = "abc"
>>> iterator = string.__iter__()

>>> iterator.__next__()
'a'
>>> iterator.__next__()
'b'
>>> iterator.__next__()
'c'

>>> iterator.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```
As we can see, when we exhaust the iterator, a `StopIteration` exception is raised.
Using the built-in `next()` function we have a greater degree of control: we can designate a value to be returned in the case where `__next__` would exhaust the object (and raise an exception):
```python
>>> string = "abc"
>>> iterator = string.__iter__()

>>> next(iterator, 3)
'a'
>>> next(iterator, 3)
'b'
>>> next(iterator, 3)
'c'
>>> next(iterator, 3)
3
>>> next(iterator, 3)
3
```
Instead of directly calling `object.__iter__()`, you should probably use the built-in `iter()` function, as it has a little added functionality depending on the use case, but ultimately does the same thing (gets an iterator for an object, that is).
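One example of that added functionality is the two-argument form, `iter(callable, sentinel)`, which builds an iterator that keeps calling `callable` until it returns the sentinel value:

```python
import itertools

# `counter` is a callable that returns 0, 1, 2, ... on successive calls
counter = itertools.count().__next__

# iter(callable, sentinel): call `counter` until it returns 3
print(list(iter(counter, 3)))  # [0, 1, 2]
```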
## itertools.cycle using iterators
Knowing all this, we can write a custom class that can infinitely iterate over another iterable:
```python
class InfiniteIterable:
    def __init__(self, original):
        self.original_iterable = original
        self.len = len(original)
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i > (self.len - 1):
            self.i = 0

        ret = self.original_iterable[self.i]
        self.i += 1

        return ret


for n in InfiniteIterable("abc"):
    print(n)
```
Since we never raise a `StopIteration` exception from our `__next__` method, the iterator never halts!
Since our custom class is an iterable object in and of itself (i.e. it defines an `__iter__` method), that method can simply return `self`! But if we didn't want to implement the iteration logic within this class, we could make `__iter__` return an instance of some other class.
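For illustration, here is a hypothetical split (the class names are mine) where the iterable and the iterator are separate classes:

```python
from itertools import islice


class InfiniteIterator:
    """The iterator: holds the current position and implements __next__."""

    def __init__(self, original):
        self.original = original
        self.i = 0

    def __next__(self):
        ret = self.original[self.i]
        self.i = (self.i + 1) % len(self.original)
        return ret


class SplitInfiniteIterable:
    """The iterable: only knows how to hand out fresh iterators."""

    def __init__(self, original):
        self.original = original

    def __iter__(self):
        return InfiniteIterator(self.original)


# islice limits the infinite stream to its first 5 items
print(list(islice(SplitInfiniteIterable("ab"), 5)))  # ['a', 'b', 'a', 'b', 'a']
```

One advantage of this split is that every `for` loop gets its own independent iterator, so two loops over the same iterable don't share a position.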
Generator functions are regular Python functions that ease the process of creating iterators. They are written exactly as normal functions, with the added requirement of including at least one `yield` statement.
The return value of a generator function is a generator, which is a kind of iterator.
```python
>>> def g():
...     while True:
...         yield 3

>>> '__iter__' in dir(g()) and '__next__' in dir(g())
True
>>> g()
<generator object g at 0x7fe49a2b35a0>
>>> type(g())
<class 'generator'>
```
When a `yield` statement is encountered inside a generator function, it yields control back to the outside code, 'returning' the value that was the argument to the `yield` statement.
The next time the generator is re-entered (via `next()`, the `.send()` method, or a for-loop iteration), execution resumes right after that `yield` statement.
The generator is exhausted when the associated generator function returns, and a `StopIteration` exception is raised, just like with iterators. If the generator function returned with an associated value, it is wrapped by the `StopIteration` exception instance and can be subsequently accessed. This means that a `return value` statement inside a generator function is semantically equivalent to `raise StopIteration(value)`, except that the exception cannot be caught from within the containing generator function.
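A quick sketch of that behaviour:

```python
def g():
    yield 1
    return "done"  # becomes StopIteration("done")


gen = g()
print(next(gen))     # 1
try:
    next(gen)
except StopIteration as e:
    print(e.value)   # done - the return value rides along on the exception
```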
## InfiniteIterable using generators
```python
def InfiniteIterable(original_iterable):
    i = 0
    while True:
        yield original_iterable[i]
        i += 1
        if i > (len(original_iterable) - 1):
            i = 0


for n in InfiniteIterable("abc"):
    print(n)
```
This accomplishes the exact same task as manually writing the iterable does, but in a more readable manner. It is essentially just syntactic sugar.
## generator.send() - bi-directional communication
Generators also implement a `.send()` method, which allows for bi-directional communication between outside code and a 'running' generator. Calling `next(generator)` (or equivalently `generator.__next__()`) is the same as calling `generator.send(None)`. That is to say, both request the next value from the generator, while `.send()` also sends some data in: inside the generator function, the `yield` expression evaluates to the value of `.send()`'s first argument (or `None` if `next()` was used instead).
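A minimal sketch of that round trip, using a hypothetical `echo` generator of my own; note that a fresh generator must first be 'primed' with `next()` (or `.send(None)`) to advance it to its first `yield` before real values can be sent in:

```python
def echo():
    received = None
    while True:
        # whatever is passed to .send() becomes the result of this yield
        received = yield f"got: {received}"


gen = echo()
print(next(gen))          # got: None  (priming run up to the first yield)
print(gen.send("hello"))  # got: hello
print(gen.send(42))       # got: 42
```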
The `yield from` syntax, introduced in PEP 380, is used for delegating control to a subgenerator. That is to say, `yield from` iterates over the requested generator and yields the values back out as they come in. The `yield from` expression also has an accessible result, present if the associated subgenerator returned with a value.
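A small sketch of that result value in action:

```python
def sub():
    yield 1
    yield 2
    return "sub finished"  # travels back through yield from


def outer():
    result = yield from sub()  # sub's return value lands in `result`
    yield result


print(list(outer()))  # [1, 2, 'sub finished']
```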
On the surface, it may seem that `yield from gen()` is just a shorthand for the following for loop:
```python
for v in gen():
    yield v
```
While that is true for a number of simple use cases, once the semantics of the iterator-generator methods `send()`, `throw()`, and `close()` are introduced, it becomes apparent that the `yield from` syntax performs a much more complicated job under the hood. This becomes evident after taking a look at the Formal Semantics section of PEP 380. Most notably, if the outer controlling generator is sent a value from external code, it propagates that value and sends it to the subgenerator behind the `yield from` expression. Alongside that, it also handles all the possible edge cases related to exception handling inside generators and the `throw()` and `close()` methods.
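A minimal sketch of that propagation, with a delegating generator that simply forwards sent values to its subgenerator:

```python
def inner(log):
    while True:
        # values sent into the delegator arrive here directly
        log.append((yield))


def delegator(log):
    yield from inner(log)  # .send() on the delegator passes straight through


log = []
gen = delegator(log)
next(gen)         # prime the generator chain
gen.send("ping")  # received by inner(), not intercepted by delegator()
gen.send("pong")
print(log)        # ['ping', 'pong']
```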