Internal Mechanism of Iterator, Iterable, and Generator in Python


Iterables, iterators, and generators are fundamental to how Python code loops over data. We use them all the time, yet the differences between them can be hard to grasp fully, especially for those still finding their way through Python's intricacies.

This article aims to demystify these constructs by looking at their internal mechanisms and at the role each one plays. By the end, you should understand how iterables, iterators, and generators differ and how they shape the flow and efficiency of Python code.

Let's first look at what an iterable is.

Iterable - In Python

An iterable is, as the name suggests, any object we can iterate over, but there's a little more to it than that. We already know some iterable objects: a list or a tuple, for example, can be iterated over with a loop. But how do we know whether an arbitrary object is iterable? The usual check is whether the object has the __iter__ method. (Python can also iterate over objects that only define __getitem__ with integer indexes starting at 0, but __iter__ is the case we focus on here.) Let me show you with an example.

Let's create a basic list in Python and iterate over it using a for loop. We'll also print all of the list's attributes and methods using the dir() function.

my_list = [1, 2, 3, 4]
for i in my_list:
    print(i)

# Inspect the list itself (not the loop variable i) to see its methods
print(dir(my_list))
💡 dir(): This function returns the names of an object's attributes and methods, including the built-in ones that every object has by default.

Output

1
2
3
4
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

[Process exited 0]

So, if the output includes a dunder method called __iter__, we can iterate over the object; if it doesn't, we usually can't.
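As a rough sketch of that check, the snippet below compares looking for __iter__ with simply calling iter() and seeing whether it succeeds. The Countdown and Squares classes and the is_iterable helper are hypothetical names made up for this illustration; they are not part of the original example.

```python
class Countdown:
    """Hypothetical iterable: defines __iter__, so a for loop accepts it."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand back a fresh iterator each time iteration begins.
        return iter(range(self.start, 0, -1))


class Squares:
    """Hypothetical class with only __getitem__ (the older sequence protocol)."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


def is_iterable(obj):
    """Return True if iter() accepts obj, False otherwise."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(hasattr([1, 2, 3], "__iter__"))  # True
print(hasattr(42, "__iter__"))         # False
print(list(Countdown(3)))              # [3, 2, 1]
print(hasattr(Squares(), "__iter__"))  # False
print(is_iterable(Squares()))          # True
print(list(Squares()))                 # [0, 1, 4]
```

Calling iter() inside a try block is the more general test, because it also accepts objects such as Squares that have no __iter__ at all.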

Now that iterables are clear, let's move on to iterators.

Iterator - In Python

An iterator in Python is indeed iterable; it can be looped over, and it includes the dunder method __iter__ in its default properties. However, what sets an iterator apart from an iterable?

An iterator is an object with state: it remembers where it is during iteration. An iterator also knows how to retrieve its next value, which it does through a dunder method called __next__. This terminology might seem dense, but it becomes clearer with examples.

Let's revisit our previous output. A list has no __next__ method, and it keeps no record of a position in an iteration. A list therefore doesn't know how to produce its next value, so it is an iterable but not an iterator.

Now, how do we get an iterator that retains its state and can retrieve the next value?

We do this by calling the built-in iter() function on our iterable, which invokes its __iter__ dunder method. This doesn't change the list itself; it returns a new iterator object over the list.

new = [1, 2, 3, 4]

i_new = iter(new) 
# OR
i_new = new.__iter__()

print(i_new)
print(dir(i_new))

Output

<list_iterator object at 0x100fc1c60>
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__length_hint__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__']

[Process exited 0]

Looking at the attributes now, we see both the __iter__ and __next__ dunder methods, which confirms that this is an iterator object. You may wonder why the iterator itself has an __iter__ method, just like the list does. The reason is that an iterator is also iterable, so calling iter() on an iterator simply returns the same object.
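A small sketch can make these two points concrete: a list is not an iterator, while an iterator returns itself from iter(). The variable names below are chosen only for this illustration.

```python
numbers = [1, 2, 3, 4]

# A list is iterable, but it is not itself an iterator.
list_has_next = hasattr(numbers, "__next__")  # False
try:
    next(numbers)
    next_on_list_failed = False
except TypeError:
    # next() refuses objects that have no __next__ method.
    next_on_list_failed = True

# Each call to iter() on the list returns a new, independent iterator.
first = iter(numbers)
second = iter(numbers)

same_object = iter(first) is first  # True: an iterator is its own iterator
independent = first is not second   # True: two separate iterators

first_values = [next(first), next(first)]  # [1, 2]
second_value = next(second)                # 1, unaffected by `first`

print(list_has_next, next_on_list_failed)
print(same_object, independent)
print(first_values, second_value)
```

Because each iterator carries its own position, advancing first has no effect on second.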

Let's look at the __next__ method with an example. The built-in next() function calls it for us.

new = [1, 2, 3, 4]

i_new = iter(new)

print(i_new)

print(next(i_new))
print(next(i_new))
print(next(i_new))
print(next(i_new))
print(next(i_new))

Output

<list_iterator object at 0x1045b7fd0>
1
2
3
4
Traceback (most recent call last):
  File "/Users/vishnu.tiwari/Desktop/Python/Misc/iig.py", line 11, in <module>
    print(next(i_new))
          ^^^^^^^^^^^
StopIteration

[Process exited 1]

As mentioned, an iterator retains its state throughout the iteration process and knows its next value. In the previous example, every time we request the next value from our iterator object, it recalls its previous state and provides the subsequent value. Once it exhausts all available values, it raises the StopIteration exception. This exception indicates that the iterator has been depleted and has no more values to offer.
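To see where that state lives, here is a minimal sketch of a hand-written iterator. CountUpTo is a hypothetical class invented for this illustration; it stores its position in an attribute and raises StopIteration once the limit is reached.

```python
class CountUpTo:
    """Hypothetical iterator that produces 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0  # the state: the last value handed out

    def __iter__(self):
        # An iterator is its own iterator.
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration  # nothing left to give
        self.current += 1
        return self.current


counter = CountUpTo(3)
print(next(counter))               # 1
print(next(counter))               # 2
print(next(counter))               # 3
print(next(counter, "exhausted"))  # exhausted
```

Passing a second argument to next() returns that default instead of raising StopIteration, which is handy when you call next() by hand.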

Now let's look at a topic that few articles cover: the mechanism of a for loop in Python.


Mechanism of the For Loop - Python

When we run a normal for loop, it knows how to handle the StopIteration exception and doesn't show it to us. In the background, a for loop first gets an iterator from our original object and then keeps fetching the next value until it hits a StopIteration exception.

Under the hood, a for loop behaves like the following while loop.

# Normal For Loop 
for item in iterable:
    # do something with item
    pass

Behind the scenes

iterator = iter(iterable)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break
    # do something with item
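The two forms above can be compared in a runnable sketch. The function names collect_with_for and collect_with_while are hypothetical, and doubling each item simply stands in for "do something with item".

```python
def collect_with_for(iterable):
    """Process items with an ordinary for loop."""
    results = []
    for item in iterable:
        results.append(item * 2)
    return results


def collect_with_while(iterable):
    """Process items the way the for loop does behind the scenes."""
    results = []
    iterator = iter(iterable)       # step 1: get an iterator
    while True:
        try:
            item = next(iterator)   # step 2: ask for the next value
        except StopIteration:
            break                   # step 3: stop quietly when exhausted
        results.append(item * 2)
    return results


data = [1, 2, 3, 4]
print(collect_with_for(data))    # [2, 4, 6, 8]
print(collect_with_while(data))  # [2, 4, 6, 8]
```

Both functions return the same list, which is the behavior the for loop hides from us.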

Forward
An iterator can only go forward; it can't go backward or start over. To iterate over the same data again, create a new iterator with iter().
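A short sketch of this one-way behavior, using a plain list and variable names chosen only for illustration:

```python
numbers = [1, 2, 3]

iterator = iter(numbers)
first_pass = list(iterator)   # consumes every value
second_pass = list(iterator)  # the same iterator has nothing left

# Asking the list for a new iterator starts again from the beginning.
fresh_pass = list(iter(numbers))

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3]
```

This is also why looping over a list twice works: each for loop requests its own fresh iterator.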

Generator

Some of you may have used generators. They're extremely useful for creating easy-to-read iterators. They look a lot like normal functions, but instead of returning a result, they yield a value. When a generator yields a value, it keeps its state until it is run again and yields the next value. Generators are therefore iterators as well, but their __iter__ and __next__ dunder methods are created automatically.

When a function contains one or more yield statements, Python treats it as a generator function. When you call a generator function, it returns a generator object without executing the function's code immediately. Instead, the function's code runs in response to the iterator's __next__() method being called.
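The lazy start described here can be observed directly. In this sketch, countdown is a hypothetical generator function and events is a helper list that records when the function body actually begins to run.

```python
events = []  # records when the generator body actually runs


def countdown(n):
    """Hypothetical generator function used only for this illustration."""
    events.append("body started")
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
events_after_call = list(events)  # [] because no body code has run yet

# The generator object gets both dunder methods automatically.
has_iter = hasattr(gen, "__iter__")
has_next = hasattr(gen, "__next__")
is_own_iterator = iter(gen) is gen

first_value = next(gen)           # runs the body up to the first yield
events_after_next = list(events)  # ["body started"]
remaining = list(gen)             # [2, 1]

print(events_after_call, has_iter, has_next, is_own_iterator)
print(first_value, events_after_next, remaining)
```

Nothing is recorded in events until the first next() call, even though countdown(3) has already returned a generator object.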

Here's a simplified explanation of how a generator function works under the hood:

  • When you call a generator function, it returns a generator object.

  • Each time you call the generator object's __next__() method (implicitly via a loop or explicitly), the generator function executes until it encounters a yield statement.

  • When a yield statement is reached, the value specified with yield is returned, and the function's execution is paused. The generator object retains its state, including the local variables' values.

  • When the generator object's __next__() method is called again, execution resumes from where it left off, continuing until the next yield statement or the end of the function is reached.

  • When the function body finishes, either by running past the last yield to the end of the function or by reaching a return statement, a StopIteration exception is raised, indicating that the generator has exhausted its sequence.

Here's a conceptual example to illustrate how this works:

def my_generator():
    print("Start of generator")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("End of generator")

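gen = my_generator()  # Creates a generator object; no function code runs yet
print(next(gen))  # Prints "Start of generator", then 1
print(next(gen))  # Prints "After first yield", then 2
print(next(gen))  # Prints "After second yield", then 3
# print(next(gen))  # Would print "End of generator", then raise StopIteration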

In this example, you can see how the generator function's code is executed up to the first yield statement when next(gen) is called for the first time. The function's execution is paused, and the value 1 is returned. Subsequent calls to next(gen) resume the function's execution from where it last yielded.
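As a further sketch of how a generator keeps its local variables between calls and what happens at a return statement, consider the hypothetical running_total generator below. It is not from the original example; it relies on the fact that a generator's return value is attached to the StopIteration exception as its value attribute.

```python
def running_total(values):
    """Hypothetical generator that yields a cumulative sum."""
    total = 0  # local state kept alive between yields
    for value in values:
        total += value
        yield total
    return "done"  # carried on the StopIteration exception as .value


gen = running_total([1, 2, 3])
totals = [next(gen), next(gen), next(gen)]  # [1, 3, 6]

try:
    next(gen)
except StopIteration as stop:
    final_value = stop.value  # "done"

print(totals)
print(final_value)
```

The total variable survives each pause, so every call to next() continues from the previous sum instead of starting over.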

Summary

This article delves into the fundamental concepts of iteration, iterables, iterators, and generators in Python. It explains the differences between them, how to identify iterables, create iterators, and utilize generators. Additionally, it explores the mechanism of a for loop and provides insights into the workings of generators through examples.

