I’ll start with a trivial question: what is a “dot operator”?
Here is an example:
hello = 'Hello world!'
print(hello.upper())
# HELLO WORLD!
Well, this is surely a “Hello world” example, but I can’t imagine anyone starting to teach you Python exactly like this. Anyway, the “dot operator” is the “.” part of hello.upper(). Let’s try a more detailed example:
class Person:
    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}")

p = Person('John')
p.shout()
# Hey! I'm John
p.num_of_persons
# 0
p.name
# 'John'
There are several places where the “dot operator” is used. To make the big picture easier to see, let’s boil its use down to two cases:
- Use it to access the attributes of an object or class,
- Use it to access functions defined in the class definition.
Obviously, we have all of this in our example, and it seems intuitive and expected. But there’s more to this than meets the eye! Take a closer look at this example:
p.shout
# <bound method Person.shout of <__main__.Person object at 0x1037d3a60>>
id(p.shout)
# 4363645248
Person.shout
# <function __main__.Person.shout(self)>
id(Person.shout)
# 4364388816
Somehow, p.shout does not refer to the same function as Person.shout, although it should. You’d at least hope so, right? And p.shout is not even a function! Let’s review the following example before we start discussing what is happening:
class Person:
    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

p = Person('John')
vars(p)
# {'name': 'John'}

def shout_v2(self):
    print("Hey, what's up?")

p.shout_v2 = shout_v2
vars(p)
# {'name': 'John', 'shout_v2': <function __main__.shout_v2(self)>}
p.shout()
# Hey! I'm John.
p.shout_v2()
# TypeError: shout_v2() missing 1 required positional argument: 'self'
For those who do not know it, the vars function returns the dictionary containing the attributes of an instance. If you run vars(Person), you’ll get a slightly different answer, but you’ll get the idea: there will be attributes with their values, as well as names that hold the class’s function definitions. Obviously, there is a difference between an object that is an instance of a class and the class object itself, so what vars returns for the two differs as well.
Now, it is perfectly valid to define a function on an object after it has been created. That is what the line p.shout_v2 = shout_v2 does. It introduces another key-value pair into the instance dictionary. Seemingly everything is fine, and we should be able to call it without problems, as if shout_v2 were specified in the class definition. But alas! Something is really wrong. We can’t call it the same way we called the shout method.
Astute readers will already have noticed how carefully I use the terms function and method. After all, there is also a difference in how Python prints them. Take a look at the examples above: shout is a method and shout_v2 is a function, at least if we look at them from the perspective of the object p. If we look at them from the perspective of the Person class, shout is a function and shout_v2 does not exist; it is defined only in the object’s dictionary (namespace). So if you’re really going to rely on object-oriented paradigms and mechanisms like encapsulation, inheritance, abstraction, and polymorphism, you won’t define functions on objects, as we did with p in our example. You will make sure to define functions in the class definition (body).
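If you really do need to attach a function to a single instance, the standard library's types.MethodType can bind it by hand. The following is only a minimal sketch: Person is a trimmed copy of the class above, and shout_v2 returns a string instead of printing, purely so the result is easy to inspect.

```python
import types


class Person:
    def __init__(self, name):
        self.name = name


def shout_v2(self):
    # Returns instead of printing (an assumption made for this sketch).
    return f"Hey, what's up? I'm {self.name}."


p = Person('John')

# Stored as-is in the instance dictionary: a plain function, not bound.
p.shout_v2 = shout_v2
try:
    p.shout_v2()
    outcome = 'called'
except TypeError:
    outcome = 'TypeError'

# Binding by hand: MethodType pairs the function with the instance.
p.shout_v2 = types.MethodType(shout_v2, p)
bound_result = p.shout_v2()
```

After the manual binding, p.shout_v2() receives p as self, just like a method defined in the class body would.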
So why are these two different, and why does the error appear? Well, the quickest answer is: because of how the “dot operator” works. The longer answer is that there is a mechanism behind the scenes that resolves the (attribute) name for you. This mechanism consists of the __getattribute__ and __getattr__ dunder methods.
At first this will probably seem unintuitive and unnecessarily complicated, but bear with me. Essentially, there are two scenarios that can happen when you try to access an attribute of an object in Python: either the attribute exists or it does not. Simple as that. In both cases, __getattribute__ is called; or, to make it easier for you, it is always called. This method:
- returns the computed attribute value, or
- explicitly calls __getattr__, or
- raises AttributeError, in which case __getattr__ is called by default.
If you want to intercept the mechanism that resolves attribute names, this is the place to hijack it. You just have to be careful, because it’s very easy to end up in an infinite loop or mess up the whole name resolution mechanism, especially in the object-oriented inheritance scenario. It’s not as simple as it seems.
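To make the infinite-loop warning concrete, here is a small sketch with two hypothetical classes, Naive and Safe (neither name comes from the article). Naive looks the attribute up through self inside __getattribute__, which triggers __getattribute__ again; Safe delegates to the default implementation.

```python
class Naive:
    def __init__(self):
        self.value = 1

    def __getattribute__(self, name):
        # self.__dict__ is itself an attribute lookup, so this line calls
        # __getattribute__ again, and again, until Python gives up.
        return self.__dict__[name]


class Safe:
    def __init__(self):
        self.value = 1

    def __getattribute__(self, name):
        # Delegating to the default implementation avoids the recursion.
        return super().__getattribute__(name)


try:
    Naive().value
    naive_result = 'no error'
except RecursionError:
    naive_result = 'RecursionError'

safe_result = Safe().value
```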
If you only want to handle cases where there is no such attribute in the object’s dictionary, you can go straight to implementing the __getattr__ method. It is called when __getattribute__ fails to resolve the attribute name. If this method cannot find the attribute or make up for the missing one either, it raises an AttributeError exception too. Here’s how you can play with these:
class Person:
    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

    def __getattribute__(self, name):
        print(f'getting the attribute name: {name}')
        return super().__getattribute__(name)

    def __getattr__(self, name):
        print(f'this attribute doesn\'t exist: {name}')
        raise AttributeError()

p = Person('John')
p.name
# getting the attribute name: name
# 'John'
p.name1
# getting the attribute name: name1
# this attribute doesn't exist: name1
#
# ... exception stack trace
# AttributeError:
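Besides printing and re-raising, __getattr__ can also supply a fallback value. The sketch below assumes a hypothetical Settings class (not part of the article) that serves missing attributes from a dictionary; normal lookup still wins whenever the attribute really exists.

```python
class Settings:
    def __init__(self, **values):
        # A regular instance attribute, found by the normal lookup.
        self.values = values

    def __getattr__(self, name):
        # Reached only after the normal lookup has failed.
        try:
            return self.values[name]
        except KeyError:
            raise AttributeError(name) from None


s = Settings(host='localhost')
host = s.host                         # served by __getattr__
port = getattr(s, 'port', 'not set')  # AttributeError turned into a default
```

Raising AttributeError for unknown names keeps getattr() with a default and hasattr() working the way callers expect.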
It is very important to call super().__getattribute__(...) in your implementation of __getattribute__, and the reason, as I wrote earlier, is that there is a lot going on in Python’s default implementation. And this is exactly where the “dot operator” gets its magic. Well, at least half of the magic is there. The other half is in how a class object is created after the class definition is interpreted.
The terminology I use here is deliberate. The class contains only functions, and we saw this in one of the first examples:
p.shout
# <bound method Person.shout of <__main__.Person object at 0x1037d3a60>>
Person.shout
# <function __main__.Person.shout(self)>
When looked at from the perspective of the object, these are called methods. The process of transforming a function of a class into a method of an object is called binding, and the result is what you see in the example above: a bound method. What makes it bound, and to what? Well, once you have an instance of a class and start calling its methods, you are, in essence, passing the object reference to each of its methods. Remember the self argument? So how does this happen, and who does it?
Well, the first part occurs when the body of the class is interpreted. Quite a few things happen in this process, such as defining the class namespace, adding attribute values to it, defining (class) functions, and binding them to their names. Now, as these functions are defined, they are wrapped in a way: wrapped in an object conceptually called a descriptor. This descriptor enables the change in the identity and behavior of class functions that we saw earlier. I’ll be sure to write a separate blog post about descriptors, but for now, know that this object is an instance of a class that implements a predefined set of dunder methods. Such a set is also called a protocol. Once these methods are implemented, objects of the class are said to follow the specific protocol and therefore behave in the expected manner. There is a difference between data and non-data descriptors. The former implement the __get__, __set__, and __delete__ dunder methods (defining either __set__ or __delete__ is what makes a descriptor a data descriptor). The latter implement only the __get__ method. Anyway, every function in a class ends up wrapped in what is called a non-data descriptor.
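You can check the non-data descriptor claim directly on a function taken from the class dictionary. This is a small sketch with a trimmed Person class; shout returns a string here only to make the result easy to compare.

```python
class Person:
    def shout(self):
        return "Hey!"


# Take the raw function from the class namespace, bypassing the dot operator.
func = Person.__dict__['shout']

has_get = hasattr(func, '__get__')        # True
has_set = hasattr(func, '__set__')        # False
has_delete = hasattr(func, '__delete__')  # False

# Calling __get__ by hand is what the dot operator does for p.shout.
p = Person()
bound = func.__get__(p, Person)
result = bound()
```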
Once you start looking up attributes using the “dot operator”, the __getattribute__ method is called and the whole name resolution process begins. This process stops as soon as the resolution succeeds, and it goes something like this:
- return the result of the data descriptor with the desired name (class level), or
- return the instance attribute with the desired name (instance level), or
- return the result of the non-data descriptor with the desired name (class level), or
- return the class attribute with the desired name (class level), or
- raise AttributeError, which essentially calls the __getattr__ method.
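The order of the steps above can be observed with a few lines of code. In this sketch, DataDesc, NonDataDesc, and Demo are hypothetical names made up for the illustration; the instance dictionary is filled in directly so that the same name exists at both the class and the instance level.

```python
class DataDesc:
    # Defines __set__, so it is a data descriptor.
    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class NonDataDesc:
    # Defines only __get__, so it is a non-data descriptor.
    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Demo:
    data = DataDesc()
    non_data = NonDataDesc()
    plain = 'from class attribute'


d = Demo()
# Put competing entries straight into the instance dictionary.
d.__dict__['data'] = 'from instance'
d.__dict__['non_data'] = 'from instance'

first = d.data       # data descriptor beats the instance dictionary
second = d.non_data  # instance dictionary beats the non-data descriptor
third = d.plain      # falls through to the plain class attribute
fourth = getattr(d, 'missing', 'AttributeError raised')
```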
My initial idea was to leave you a reference to the official documentation on how this mechanism is implemented, at least a Python mockup, for learning purposes, but I’ve decided to help you with that part as well. However, I strongly recommend that you read the full page of the official documentation.
So in the following code snippet, I will put some of the descriptions in the comments, to make it easier to read and understand the code. Here it is:
def find_name_in_mro(cls, name, default):
    "Emulate _PyType_Lookup() in Objects/typeobject.c"
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default

def object_getattribute(obj, name):
    "Emulate PyObject_GenericGetAttr() in Objects/object.c"
    # Create vanilla object for later use.
    null = object()
    """
    obj is an object instantiated from our custom class. Here we get
    the class it was instantiated from.
    """
    objtype = type(obj)
    """
    name represents the name of the class function, instance attribute,
    or any class attribute. Here, we try to find it and keep a
    reference to it. MRO is short for Method Resolution Order, and it
    has to do with class inheritance. Not really that important at
    this point. Let's say that this mechanism optimally finds name
    through all parent classes.
    """
    cls_var = find_name_in_mro(objtype, name, null)
    """
    Here we check if this class attribute is an object that has the
    __get__ method implemented. If it does, it is a descriptor (data
    or non-data). This is important for further steps.
    """
    descr_get = getattr(type(cls_var), '__get__', null)
    """
    So now it's either our class attribute references a descriptor, in
    which case we test to see if it is a data descriptor and we
    return the result of the descriptor's __get__ method, or we go to
    the next if code block.
    """
    if descr_get is not null:
        if (hasattr(type(cls_var), '__set__')
                or hasattr(type(cls_var), '__delete__')):
            return descr_get(cls_var, obj, objtype)  # data descriptor
    """
    In cases where the name doesn't reference a data descriptor, we
    check to see if it references the variable in the object's
    dictionary, and if so, we return its value.
    """
    if hasattr(obj, '__dict__') and name in vars(obj):
        return vars(obj)[name]  # instance variable
    """
    In cases where the name does not reference the variable in the
    object's dictionary, we try to see if it references a non-data
    descriptor and return the result of its __get__ method.
    """
    if descr_get is not null:
        return descr_get(cls_var, obj, objtype)  # non-data descriptor
    """
    In case name did not reference anything from above, we try to see
    if it references a class attribute and return its value.
    """
    if cls_var is not null:
        return cls_var  # class variable
    """
    If name resolution was unsuccessful, we throw an AttributeError
    exception, and __getattr__ is being invoked.
    """
    raise AttributeError(name)
Please note that this implementation is in Python only for the purpose of documenting and describing the logic implemented in the __getattribute__ method. The real thing is implemented in C. Just by looking at it, you can imagine that it is better not to go through the trouble of reimplementing the whole thing. The best way is to do some part of the resolution yourself and then fall back on CPython’s implementation with return super().__getattribute__(name), as shown in the example above.
The important thing here is that each class function (which is an object) is wrapped in a non-data descriptor (which is an object of the function class), and this means that this wrapper object has the __get__ dunder method defined. What this dunder method does is return a new callable (think of it as a new function) whose first argument is the reference to the object on which we are applying the “dot operator”. I said to think of it as a new function because it is a callable. In essence, it is another object, of a type called MethodType. Check it out:
type(p.shout)
# getting the attribute name: shout
# method
type(Person.shout)
# function
One interesting thing is definitely this function class. It is exactly the wrapper object that defines the __get__ method. So once we try to access shout as a method through the “dot operator”, __getattribute__ goes through the list above and stops at the third case (the non-data descriptor). This __get__ method contains additional logic that takes the object reference and creates a MethodType with references to the function and the object.
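The references kept by the bound method are visible from ordinary Python through its __func__ and __self__ attributes. The sketch below uses a trimmed Person class and the standard types module. It also hints at why the two id() values earlier differed: every access through the instance builds a fresh bound method object.

```python
import types


class Person:
    def __init__(self, name):
        self.name = name

    def shout(self):
        return f"Hey! I'm {self.name}."


p = Person('John')

is_method = isinstance(p.shout, types.MethodType)           # True
is_function = isinstance(Person.shout, types.FunctionType)  # True

# The bound method keeps references to the function and to the instance.
same_function = p.shout.__func__ is Person.shout  # True
same_instance = p.shout.__self__ is p             # True

# Each attribute access creates a new bound method object...
same_object = p.shout is p.shout  # False
# ...but two of them compare equal, since they wrap the same pair.
equal = p.shout == p.shout        # True
```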
Here is the mockup of this __get__ method from the official documentation:
class Function:
    ...

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return MethodType(self, obj)
Ignore the difference in the class name. I have been using function rather than Function so far to make it easier to follow, but I’ll use the Function name from now on to stay in line with the explanation in the official documentation.
Anyway, just looking at this mockup may be enough to understand how the function class fits into the picture, but let me add a couple of missing lines of code, which will probably make things even clearer. I will add two more class functions to this example, namely:
class Function:
    ...

    def __init__(self, fun, *args, **kwargs):
        ...
        self.fun = fun

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return MethodType(self, obj)

    def __call__(self, *args, **kwargs):
        ...
        return self.fun(*args, **kwargs)
Why did I add these functions? Well, now you can easily imagine how the Function object plays its role in this whole method binding scenario. The new Function object stores the original function as an attribute. The object is also callable, which means we can invoke it like a function. In that case, it works just like the function it wraps. Remember, everything in Python is an object, even functions. And MethodType ‘wraps’ the Function object along with the reference to the object on which we are calling the method (in our case, shout).
How does MethodType do this? Well, it keeps these references and implements the callable protocol. Here is the mockup from the official documentation for the MethodType class:
class MethodType:
    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)
Again, for the sake of brevity: func ends up referencing our initial class function (shout), obj references the instance (p), and then we have the positional and keyword arguments that are passed along. The self in the shout declaration ends up referencing this ‘obj’, which is essentially p in our example.
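Putting the Function and MethodType mockups together gives a toy version of the whole binding mechanism that actually runs. Treat it as a sketch for learning purposes only: these pure-Python classes stand in for CPython's built-in function and method types, and applying Function as a decorator is my own shortcut to get the wrapping that Python otherwise does for you.

```python
class MethodType:
    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        # Prepend the stored instance, which becomes `self` in the function.
        return self.__func__(self.__self__, *args, **kwargs)


class Function:
    def __init__(self, fun):
        self.fun = fun

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                # accessed through the class
        return MethodType(self, obj)   # accessed through an instance

    def __call__(self, *args, **kwargs):
        return self.fun(*args, **kwargs)


class Person:
    def __init__(self, name):
        self.name = name

    @Function  # wrap shout in our toy non-data descriptor
    def shout(self):
        return f"Hey! I'm {self.name}."


p = Person('John')
via_instance = p.shout()     # MethodType -> Function -> original function
via_class = Person.shout(p)  # Function -> original function
```

Both calls end up in the original shout with p as self; the only difference is who supplies p.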
By now, it should be clear why we make a distinction between functions and methods, and how functions get bound once they are accessed through objects using the “dot operator.” If you think about it, we would be perfectly fine invoking class functions like this:
class Person:
    num_of_persons = 0

    def __init__(self, name):
        self.name = name

    def shout(self):
        print(f"Hey! I'm {self.name}.")

p = Person('John')
Person.shout(p)
# Hey! I'm John.
However, this is not really the recommended way and is just plain ugly. Normally, you won’t have to do this in your code.
So before we wrap up, I want to go over a couple of attribute resolution examples just to make this easier to understand. Let’s use the example above and find out how the dot operator works.
p.name
"""
1. __getattribute__ is invoked with p and "name" arguments.
2. objtype is Person.
3. descr_get is null because the Person class doesn't have
"name" in its dictionary (namespace).
4. Since there is no descr_get at all, we skip the first if block.
5. "name" does exist in the object's dictionary so we get the value.
"""
p.shout('Hey')
"""
Before we go into name resolution steps, keep in mind that
Person.shout is an instance of the function class. Essentially, it gets
wrapped in it. And this object is callable, so you can invoke it with
Person.shout(...). From a developer perspective, everything works just
as if it were defined in the class body. But in the background, it
most certainly is not.
1. __getattribute__ is invoked with p and "shout" arguments.
2. objtype is Person.
3. Person.shout is actually wrapped and is a non-data descriptor.
   So this wrapper does have the __get__ method implemented, and it
   gets referenced by descr_get.
4. The wrapper object is a non-data descriptor, so the first if block
   is skipped.
5. "shout" doesn't exist in the object's dictionary because it is part
   of the class definition. The second if block is skipped.
6. "shout" is a non-data descriptor, and the result of calling its
   __get__ method is returned from the third if code block.
Now, here we wrote p.shout('Hey'), but what p.shout gave us is the
result of the descriptor's __get__ method, which is a MethodType
object. Because of this, p.shout(...) works, but what ends up being
called is an instance of the MethodType class. This object is
essentially a wrapper around the `Function` wrapper, and it holds
references to the `Function` wrapper and our object p. In the end, when
you invoke p.shout('Hey'), what ends up being invoked is the `Function`
wrapper with the p object and 'Hey' as positional arguments. (The shout
defined above takes only self, so to actually run this call, shout
would need to accept one more positional argument.)
"""
Person.shout(p)
"""
Before we go into name resolution steps, keep in mind that
Person.shout is an instance of the function class. Essentially, it gets
wrapped in it. And this object is callable, so you can invoke it with
Person.shout(...). From a developer perspective, everything works just
as if it were defined in the class body. But in the background, it
most certainly is not.
This part is the same. The following steps are different. Check
it out.
1. __getattribute__ is invoked with Person and "shout" arguments.
2. objtype is type. This mechanism is described in my post on
   metaclasses.
3. type itself doesn't have "shout" in its dictionary (namespace),
   so descr_get is null.
4. Since there is no descr_get at all, the first if block is
   skipped.
5. "shout" does exist in the object's dictionary, because Person is an
   object after all. So the "shout" function is returned.
When Person.shout is invoked, what actually gets invoked is an instance
of the `Function` class, which is also callable and is a wrapper around
the original function defined in the class body. This way, the original
function gets called with all positional and keyword arguments.
"""
If reading this article in one sitting was not an easy task, don’t worry! The whole mechanism behind the “dot operator” is not something that is easy to grasp. There are at least two reasons: one is how __getattribute__ does name resolution, and the other is how class functions are transformed when the class body is interpreted. So be sure to go over the article a couple of times and play with the examples. Experimenting is really what prompted me to start a series called Advanced Python.
One more thing! If you like the way I explain things and there’s something advanced in the Python world you’d like to read about, say thanks!