If you are new to Python, you may have come across the terms “iteration” and “membership” and wondered what they mean. These concepts are fundamental to understanding how Python handles collections of data, such as lists, tuples, and dictionaries. Python uses special dunder methods to enable these capabilities.
But what exactly are dunder methods? Dunder (or magic) methods are special methods in Python that begin and end with a double underscore, hence the name “dunder.” They implement various protocols and support a wide range of tasks, such as checking membership and iterating over elements. In this article, we will focus on two of the most important dunder methods: __contains__ and __iter__. Let's get started.
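As a quick illustration of how dunder methods hook into built-in syntax, here is a minimal sketch. The Playlist class and its songs attribute are hypothetical names invented for this example; the point is only that len(...) and the in operator are routed to __len__ and __contains__.

```python
class Playlist:
    """Hypothetical class used only to illustrate dunder methods."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        # Called by the built-in len()
        return len(self.songs)

    def __contains__(self, title):
        # Called by the `in` operator
        return title in self.songs


playlist = Playlist(["Intro", "Outro"])
print(len(playlist))        # 2
print("Intro" in playlist)  # True
```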
Understanding Pythonic Loops with the __iter__ Method
Consider a basic implementation of a file directory using Python classes as follows:
from typing import List

class File:
    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

class Directory:
    def __init__(self, files: List[File]) -> None:
        self._files = files
This is a simple setup in which the directory has an instance attribute that holds a list of File objects. Now, if we want to iterate over the directory object, we should be able to use a for loop as follows:
directory = Directory(
    files=[File(f"file_{i}") for i in range(10)]
)
for _file in directory:
    print(_file)
We initialize a directory object with ten sequentially named files and use a for loop to iterate over each element. Pretty simple, but whoops! You receive an error message: TypeError: 'Directory' object is not iterable.
What went wrong? Well, our Directory class does not yet support iteration. In Python, for an object to be iterable, its class must implement the __iter__ dunder method. All built-in iterables in Python, such as lists, dictionaries, and sets, implement this method, which is why we can use them in a loop.
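This is easy to check with the built-in hasattr and iter functions. In the sketch below, Plain is a hypothetical class that defines neither __iter__ nor __getitem__, so iter() rejects it in the same way the for loop rejected Directory.

```python
# Built-in collections all define __iter__
for collection in ([1, 2], {"a": 1}, {1, 2}, (1, 2), "ab"):
    print(type(collection).__name__, hasattr(collection, "__iter__"))


class Plain:
    """Hypothetical class with no __iter__ (and no __getitem__)."""


try:
    iter(Plain())
    iterable = True
except TypeError as error:
    iterable = False
    print(error)  # 'Plain' object is not iterable
```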
So to make our Directory object iterable, we need to create an iterator. Think of an iterator as a helper that provides us with elements one by one when we request them. For example, when we loop through a list, the iterator object will provide us with the next element in each iteration until we reach the end of the loop. This is simply how an iterator is defined and implemented in Python.
In Python, an iterator must know how to provide the next element of a sequence. It does this using a method called __next__. When there are no more items to give, it raises a special exception called StopIteration to say, “Hey, we're done here.” An infinite iterator simply never raises StopIteration.
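To make the protocol concrete, here is a small sketch that first drives a built-in list iterator with iter and next, and then defines a hypothetical CountUp class whose __next__ never raises StopIteration. CountUp is an illustrative name and is not part of the directory example.

```python
# A list iterator hands out items one at a time, then signals the end.
numbers = iter([10, 20])
print(next(numbers))  # 10
print(next(numbers))  # 20
try:
    next(numbers)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True


class CountUp:
    """Hypothetical infinite iterator: __next__ never raises StopIteration."""

    def __init__(self, start=0):
        self._current = start

    def __next__(self):
        value = self._current
        self._current += 1
        return value


# Calling next() a fixed number of times keeps the infinite iterator under control.
counter = CountUp()
first_three = [next(counter) for _ in range(3)]
print(first_three)  # [0, 1, 2]
```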
Let's create an iterator class for our directory. It will take the list of files as an argument and implement the __next__ method to provide us with the next file in the sequence, keeping track of the current position with an index. The implementation looks like this:
class FileIterator:
    def __init__(self, files: List[File]) -> None:
        self.files = files
        self._index = 0

    def __next__(self):
        # Signal the end of iteration once every file has been returned
        if self._index >= len(self.files):
            raise StopIteration
        value = self.files[self._index]
        self._index += 1
        return value
We accept the files as an initialization argument and set an index value to 0. The __next__ method checks whether the index has moved past the end of the list. If so, it raises a StopIteration exception to signal the end of the iteration. Otherwise, it returns the file at the current index and moves to the next one by incrementing the index. This process continues until all files have been visited.
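As a side note rather than a change to the example: Python's iterator protocol also expects an iterator to define __iter__ and return itself from it, so that the iterator can be used anywhere an iterable is accepted. A sketch of that convention, using a hypothetical SelfIterator class that mirrors FileIterator:

```python
class SelfIterator:
    """Hypothetical index-based iterator that also defines __iter__."""

    def __init__(self, items):
        self._items = items
        self._index = 0

    def __iter__(self):
        # Returning self lets the iterator be used directly in a for loop
        return self

    def __next__(self):
        if self._index >= len(self._items):
            raise StopIteration
        value = self._items[self._index]
        self._index += 1
        return value


iterator = SelfIterator(["a", "b", "c"])
print(next(iterator))       # a
remaining = list(iterator)  # list() calls iter() on the iterator itself
print(remaining)            # ['b', 'c']
```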
However, we're not done yet! We haven't implemented the __iter__ method, which must return an iterator object. Now that we have the FileIterator class, we can finally move on to the __iter__ method.
class Directory:
    def __init__(self, files: List[File]) -> None:
        self._files = files

    def __iter__(self):
        return FileIterator(self._files)
The __iter__ method simply initializes a FileIterator object with the directory's list of files and returns it. That's all it takes! With this implementation, we can now loop through our directory structure using Python loops. Let's see it in action:
directory = Directory(
    files=[File(f"file_{i}") for i in range(10)]
)
for _file in directory:
    # Print the path, since File does not define a string representation
    print(_file.file_path, end=", ")
# Output: file_0, file_1, file_2, file_3, file_4, file_5, file_6, file_7, file_8, file_9,
The for loop internally calls the __iter__ method to obtain an iterator and then repeatedly calls __next__ on it to produce this result. Although this works, the underlying mechanics of iteration in Python may still seem unclear. To understand them better, let's use a while loop to implement the same mechanism manually.
directory = Directory(
    files=[File(f"file_{i}") for i in range(10)]
)
iterator = iter(directory)
while True:
    try:
        # Get the next item if available; raises StopIteration if no item is left.
        item = next(iterator)
        print(item.file_path, end=", ")
    except StopIteration:
        break  # Catch the exception and exit the while loop
# Output: file_0, file_1, file_2, file_3, file_4, file_5, file_6, file_7, file_8, file_9,
We call the iter function on the directory object to acquire the FileIterator. Then, we manually use the next function to invoke the __next__ dunder method on the FileIterator object. We handle the StopIteration exception to gracefully end the while loop once all elements have been exhausted. As expected, we get the same result as before!
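For comparison, similar behavior can often be had without a separate iterator class by writing __iter__ as a generator. This is an alternative sketch, not the approach used above: GeneratorDirectory is a hypothetical name, and plain strings stand in for File objects to keep the block self-contained.

```python
class GeneratorDirectory:
    """Hypothetical variant of Directory whose __iter__ is a generator."""

    def __init__(self, files):
        self._files = list(files)

    def __iter__(self):
        # Each call creates a fresh generator, which Python treats as an iterator
        yield from self._files


gen_directory = GeneratorDirectory(f"file_{i}" for i in range(3))
names = [name for name in gen_directory]
print(names)  # ['file_0', 'file_1', 'file_2']

# Iterating a second time works because __iter__ builds a new generator
print(list(gen_directory) == names)  # True
```

The explicit FileIterator class remains useful for seeing what the generator does on your behalf: tracking position and raising StopIteration at the end.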
Testing Membership with the __contains__ Method
Checking whether an element exists in a collection of objects is a fairly common use case. In our example above, for instance, we will frequently need to check whether a file exists in a directory. Python simplifies this syntactically with the “in” operator.
print(0 in (1,2,3,4,5)) # False
print(1 in (1,2,3,4,5)) # True
The in operator is primarily used in conditional expressions. But what if we try this with our directory example?
print("file_1" in directory) # False
print("file_12" in directory) # False
They both give us False, which is incorrect! Why? To verify membership, we need to implement the __contains__ dunder method. When it is not implemented, Python falls back to the __iter__ method and compares each element with the == operator. In our case, it will iterate over each element and check whether the string “file_1” matches any File object in the list. Since we are comparing a string to custom File objects, none of the objects match, so the expression evaluates to False.
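The fallback can be observed directly. In this sketch, the hypothetical OnlyIter class defines __iter__ but not __contains__, and the hypothetical Probe class records every == comparison made against it, so the log shows the in operator walking the elements until it finds a match.

```python
comparisons = []


class Probe:
    """Hypothetical value that logs every equality check made against it."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        comparisons.append(other)
        return self.value == other


class OnlyIter:
    """Hypothetical container with __iter__ but no __contains__."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        return iter(self._items)


container = OnlyIter([1, 2, 3])
found = Probe(2) in container
print(found)        # True
print(comparisons)  # [1, 2]
```

The search stops at the first match, which is why 3 never appears in the log.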
To solve this problem, we must implement the __contains__ dunder method in our Directory class.
class Directory:
    def __init__(self, files: List[File]) -> None:
        self._files = files

    def __iter__(self):
        return FileIterator(self._files)

    def __contains__(self, item):
        for _file in self._files:
            # Check if file_path matches the item being checked
            if item == _file.file_path:
                return True
        return False
Here, we change the functionality to iterate over each object and match the file_path of the File object with the string passed to the function. Now, if we run the same code to check for its existence, we will get the correct result!
directory = Directory(
    files=[File(f"file_{i}") for i in range(10)]
)
print("file_1" in directory) # True
print("file_12" in directory) # False
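The explicit loop can also be condensed with the built-in any function. The sketch below is one possible variation, not the implementation shown above: PathDirectory is a hypothetical name, and accepting either a path string or a File object is an added assumption.

```python
from typing import List, Union


class File:
    def __init__(self, file_path: str) -> None:
        self.file_path = file_path


class PathDirectory:
    """Hypothetical variant of Directory with a condensed __contains__."""

    def __init__(self, files: List[File]) -> None:
        self._files = files

    def __contains__(self, item: Union[str, File]) -> bool:
        # Assumption: callers may pass a path string or a File object
        path = item.file_path if isinstance(item, File) else item
        # any() stops at the first matching path, like the early return
        return any(path == _file.file_path for _file in self._files)


path_directory = PathDirectory([File(f"file_{i}") for i in range(10)])
print("file_1" in path_directory)        # True
print(File("file_3") in path_directory)  # True
print("file_12" in path_directory)       # False
```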
Wrapping Up
And that's it! Using our simple directory structure example, we built a simple iterator and membership checker to understand the inner workings of Pythonic loops. Design decisions and implementations like these appear quite often in production-level code, and this example covered the core concepts behind the __iter__ and __contains__ methods. Keep practicing with these techniques to strengthen your understanding and become a more competent Python programmer!
Kanwal Mehreen is a machine learning engineer and technical writer with a deep passion for data science and the intersection of AI with medicine. She is the co-author of the eBook “Maximize Productivity with ChatGPT.” As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is a passionate advocate for change and founded FEMCodes to empower women in STEM fields.