I am wondering how self works under-the-hood in Python classes.

My current limited understanding is that when a class is defined, e.g.

class Foo:
    def __init__(self, x: int):
        self.x = x

    def bar(self, a: float):
        print(self.x)
        print(a)

we get Foo in the current environment, and this object contains the methods attached to the class definition, in this case bar.

When we create an instance, e.g.

foo = Foo(1)

We now have both foo and Foo, the former containing the assigned fields, and the latter containing the methods.

So now when I call:

foo.bar(3.14)

a lookup to find bar is performed, first in foo.__dict__, and then, failing that, in Foo.__dict__.

Once bar is found in Foo.__dict__, it is called roughly like:

Foo.__dict__["bar"](foo, 3.14)

Is this an accurate explanation of what happens?

And if so, how is foo actually passed to the method? Is it via a pointer to the instance?

I am a bit confused about how PyObject and PyTypeObject fit into this.

    Your question suggests you already know this, but just to be sure: self is not a keyword in Python. It is just a naming convention. You could replace every occurrence of self in your code with meMyself or targetInstance, and it would work the same. Commented May 30 at 10:00

2 Answers

I think you've got the right idea. Your explanation is pretty much spot-on, but there's just something missing. So when you call foo.bar(3.14), Python does find bar in Foo.__dict__ like you said. But here's the interesting part: it doesn't just call the function directly. Python uses the descriptor protocol to automatically inject self.

# So when you write this, for instance:
foo.bar(3.14)
# Python basically does this behind the scenes:
method = Foo.__dict__["bar"].__get__(foo, Foo)
method(3.14)

Generally, functions have a __get__ method that creates a "bound method" - basically a version of the function that already knows what self should be.

Here is a quick demonstration:

class Foo:
    def bar(self, a):
        print(f"self: {self}, a: {a}")

foo = Foo()

# These do the same thing:
foo.bar(3.14)
Foo.bar(foo, 3.14)

# You can even grab the bound method:
bound = foo.bar
print(bound)  
bound(3.14)   
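
To make the descriptor step explicit, here is a minimal sketch that calls __get__ by hand and compares the result with ordinary attribute access. The class mirrors the Foo above, except that bar returns its arguments instead of printing them (a change made only so the results can be compared); the names raw and bound are illustrative.

```python
import types


class Foo:
    def bar(self, a):
        # Return the arguments so the calls below can be compared.
        return (self, a)


foo = Foo()

# The class __dict__ stores a plain function, not a method.
raw = Foo.__dict__["bar"]
print(isinstance(raw, types.FunctionType))   # True

# Calling __get__ by hand produces the same kind of object as foo.bar.
bound = raw.__get__(foo, Foo)
print(isinstance(bound, types.MethodType))   # True
print(bound == foo.bar)                      # True

# Calling the bound method is equivalent to passing foo explicitly.
print(bound(3.14) == raw(foo, 3.14))         # True
```

The instance is passed only when the function is finally called; __get__ merely packages the function and the instance together.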

5 Comments

As for the C internals, you're right about PyObject and PyTypeObject. In CPython, self is indeed a pointer to a PyObject struct containing the instance data and a reference to its type. The descriptor protocol is what makes this feel seamless in Python code. So yes, your mental model is correct: Python just adds some automation to make method calls feel natural.
Thank you for your help :) So just to confirm, self is inserted into the method via the descriptor protocol at function call time, rather than when the class is defined?
Hi Amira! Welcome to Stack Overflow, and congrats on your direct and correct answer! I believe your answer could be further improved by adding the exact search order for attributes in an instance and in a class (no problem if you don't want to, but it would make this a very complete answer). You will find that in the docs, possibly where the descriptor protocol is explained, but it is something like: search the class and superclasses for a data descriptor, fall back to the instance's __dict__, then fall back to the class/superclasses for non-data descriptors (at this point it calls meth.__get__).
@FISR: that is correct. self is supplied at method call time, or more specifically at the . retrieval of the method's name: foo.bar (without the parentheses that perform the call) will run Foo.bar.__get__(foo, Foo), which returns a "method" instance, a callable that wraps Foo.bar with self already bound, much like a partial argument. So you can do x = foo.bar and, on another line, x(3.14), and it will work.
@FISR I wouldn't say "inserted" into the method; it is merely passed implicitly as the first argument via the descriptor protocol. In other words, foo.bar evaluates to a partially applied form of bar (an object of type method, exposed as types.MethodType), with foo applied as the first positional argument. You could think of foo.bar as evaluating to lambda *args, **kwargs: bar(foo, *args, **kwargs) (although it doesn't work like that under the hood).
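
The lookup order mentioned in the comments above (data descriptors on the class, then the instance __dict__, then non-data descriptors on the class) can be observed with a small sketch. The class Demo and its attribute names are hypothetical, chosen only for illustration. A property counts as a data descriptor because it defines __set__, while a plain function defines only __get__.

```python
class Demo:
    @property
    def data(self):
        # property is a data descriptor, so it is consulted before
        # the instance __dict__.
        return "from property"

    def method(self):
        # Functions are non-data descriptors, so the instance __dict__
        # is consulted before them.
        return "from class"


d = Demo()

# Write straight into the instance __dict__, bypassing normal assignment.
d.__dict__["data"] = "from instance"
d.__dict__["method"] = lambda: "from instance"

print(d.data)      # from property
print(d.method())  # from instance
```

This is why an instance attribute can shadow a method of the same name but cannot shadow a property.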

Your analysis is mostly correct, but notice the following details:

>>> class Foo:
...   def bar(self, a:float): pass
... 
>>> a = Foo()
>>> b = Foo()
>>> Foo.bar
<function Foo.bar at 0x72467d71f240>
>>> a.bar
<bound method Foo.bar of <__main__.Foo object at 0x72467d8b9d60>>
>>> b.bar
<bound method Foo.bar of <__main__.Foo object at 0x72467d8b9d90>>
>>> a.bar.__self__
<__main__.Foo object at 0x72467d8b9d60>
>>> a
<__main__.Foo object at 0x72467d8b9d60>

As you can see, Foo.bar on the class refers to something different from a.bar and b.bar on the instances: the instances give you a bound method, whereas the class gives you the plain (unbound) function. The main difference is that the bound method has an additional attribute __self__, which refers back to the instance it is bound to (the original function is available as __func__). The rest is syntactic sugar: calling a.bar(3.14) is equivalent to calling Foo.bar(a.bar.__self__, 3.14).
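
As a minimal sketch of that equivalence, the snippet below inspects __self__ and __func__ on a bound method. It reuses the Foo class from the session above, except that bar returns a value (an assumption made only so the calls can be compared); the names m, first and second are illustrative.

```python
class Foo:
    def bar(self, a: float):
        return a * 2


a = Foo()
m = a.bar

print(m.__self__ is a)        # True: the instance the method is bound to
print(m.__func__ is Foo.bar)  # True: the underlying plain function

# Calling the bound method is the same as calling the function
# with the instance as its first argument.
print(m(3.14) == m.__func__(m.__self__, 3.14))  # True

# Each attribute access builds a new bound method object,
# but two of them compare equal.
first, second = a.bar, a.bar
print(first is not second)    # True
print(first == second)        # True
```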

It gets a little bit more complicated as __dict__ is not enough; you have to understand __getattr__() and __getattribute__(), which might change what happens when attributes are looked up. I recommend reading https://snarky.ca/unravelling-attribute-access-in-python/

You should also read https://snarky.ca/unravelling-pythons-classes/ if you're interested in more details about classes. Both articles are by Brett Cannon of Python fame.

PyObject and PyTypeObject are only relevant to CPython, the implementation of the Python interpreter in C; there are other implementations in other languages, like Jython in Java. Simplified, the former is the base struct that every object starts with (a reference count and a pointer to the object's type), while the latter is the struct that represents a type, i.e. a class object.

1 Comment

Hi Philipp, welcome to Stack Overflow, and thanks for your detailed answer. If you want to improve it further, you might want to check Python's "descriptor protocol" documentation, as it details the steps taken when calling a method, which are carried out by the .__getattribute__ method. Best regards!
