What is the purpose of the word 'self'?
What is the purpose of the self
word in Python? I understand it refers to the specific object created from that class, but I can't see why it explicitly needs to be added to every function as a parameter. To illustrate, in Ruby I can do this:
class myClass
def myFunc(name)
@name = name
end
end
Which I understand, quite easily. However in Python I need to include self
:
class myClass:
def myFunc(self, name):
self.name = name
Can anyone talk me through this? It is not something I've come across in my (admittedly limited) experience.
The reason you need to use self.
is because Python does not use the @
syntax to refer to instance attributes. Python decided to do methods in a way that makes the instance to which the method belongs be passed automatically, but not received automatically: the first parameter of methods is the instance the method is called on. That makes methods entirely the same as functions, and leaves the actual name to use up to you (although self
is the convention, and people will generally frown at you when you use something else.) self
is not special to the code, it's just another object.
Python could have done something else to distinguish normal names from attributes -- special syntax like Ruby has, or requiring declarations like C++ and Java do, or perhaps something yet more different -- but it didn't. Python's all for making things explicit, making it obvious what's what, and although it doesn't do it entirely everywhere, it does do it for instance attributes. That's why assigning to an instance attribute needs to know what instance to assign to, and that's why it needs self.
.
Let's say you have a class ClassA
which contains a method methodA
defined as:
def methodA(self, arg1, arg2):
# do something
and ObjectA
is an instance of this class.
Now when ObjectA.methodA(arg1, arg2)
is called, python internally converts it for you as:
ClassA.methodA(ObjectA, arg1, arg2)
The self
variable refers to the object itself.
Let’s take a simple vector class:
class Vector:
def __init__(self, x, y):
self.x = x
self.y = y
We want to have a method which calculates the length. What would it look like if we wanted to define it inside the class?
def length(self):
return math.sqrt(self.x ** 2 + self.y ** 2)
What should it look like when we were to define it as a global method/function?
def length_global(vector):
return math.sqrt(vector.x ** 2 + vector.y ** 2)
So the whole structure stays the same. How can me make use of this? If we assume for a moment that we hadn’t written a length
method for our Vector
class, we could do this:
Vector.length_new = length_global
v = Vector(3, 4)
print(v.length_new()) # 5.0
This works because the first parameter of length_global
, can be re-used as the self
parameter in length_new
. This would not be possible without an explicit self
.
Another way of understanding the need for the explicit self
is to see where Python adds some syntactical sugar. When you keep in mind, that basically, a call like
v_instance.length()
is internally transformed to
Vector.length(v_instance)
it is easy to see where the self
fits in. You don't actually write instance methods in Python; what you write is class methods which must take an instance as a first parameter. And therefore, you’ll have to place the instance parameter somewhere explicitly.
When objects are instantiated, the object itself is passed into the self parameter.
Because of this, the object’s data is bound to the object. Below is an example of how you might like to visualize what each object’s data might look. Notice how ‘self’ is replaced with the objects name. I'm not saying this example diagram below is wholly accurate but it hopefully with serve a purpose in visualizing the use of self.
The Object is passed into the self parameter so that the object can keep hold of its own data.
Although this may not be wholly accurate, think of the process of instantiating an object like this: When an object is made it uses the class as a template for its own data and methods. Without passing it's own name into the self parameter, the attributes and methods in the class would remain as a general template and would not be referenced to (belong to) the object. So by passing the object's name into the self parameter it means that if 100 objects are instantiated from the one class, they can all keep track of their own data and methods.
See the illustration below: