Names and Objects

In contrast to most other programming languages, Python follows a very clean and simple approach to memory access and management. We now dive deeper into Python’s internal workings to arrive at a new understanding of what we called variables in the Crash Course.

Variables in the Non-Pythonian World

Most programming languages, C for instance, assign fixed names to memory locations. Such combinations of memory location and name are known as variables. Assigning a value to a variable then means that the compiler or interpreter writes the value to the memory location to which the variable’s name belongs. There is a one-to-one correspondence between variable names and memory locations.

Consider the following C code:

int a;
int b;
a = 5;
b = a;

The first two lines tell the C compiler to reserve memory for two integer variables. The third line writes the value 5 to the location named a. The fourth line reads the value at the location named a and writes (copies) it to the location named b.

scheme with sequence of memory locations and corresponding names

Fig. 21 Memory is organized as a linear sequence of bytes. Used and currently unused bytes are managed by the operating system and by the compiler. In C programs there is a one-to-one correspondence between variable names and memory locations.

Variables in Python

Python allows for multiple names per memory location and adds a layer of abstraction.

In Python everything is an object and objects are stored somewhere in memory. If we use integers in Python, then the integer value is not written directly to memory. Instead, additional information is added and the resulting more complex data structure is written to memory.
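The following minimal sketch illustrates that an integer is stored as a full object rather than as a bare number. It uses sys.getsizeof from the standard library. The reported size depends on the Python implementation and platform, so no particular value is assumed here.

```python
import sys

n = 5

# The integer is an object with a type attached to it.
print(type(n))

# Size of the whole object in bytes (value plus additional information).
# The exact number is implementation-dependent.
size = sys.getsizeof(n)
print(size)
```

On common CPython builds the reported size is noticeably larger than the 4 or 8 bytes a C integer would occupy.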

A newly created Python object does not have a name. Instead, Python internally assigns a number to each object, the object identifier or object ID for short. The ID is unique among all objects existing at the same time, so there is a one-to-one correspondence between object IDs and memory locations.

In addition to a list of all object IDs (and corresponding memory locations), Python maintains a list of names occurring in the source code. Each name refers to exactly one object, but different names may refer to the same object. In this sense Python does not know variables as described above, but only objects and names tied to objects.

Consider the following code:

a = 5
b = a

The first line creates an integer object containing the value 5 and then ties the name a to this object. The second line takes the object referenced by the name a and ties a second name b to it.

scheme with sequence of memory locations and corresponding object IDs and names

Fig. 22 In Python one memory location may have several names, but a unique object ID.

Important

The assignment operation = in Python is not about writing something to memory. Instead, Python takes the object on the right-hand side of = and ties a name to it.

The object on the right-hand side may have existed before or it may be created by some operation specified by the code following =.

It’s also possible to create nameless objects. Simply omit name = before some object creation code.
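A minimal sketch of nameless objects follows. The name length is chosen for illustration only.

```python
# Expression statement: a list object is created, but no name is tied to it.
[1, 2, 3]

# A nameless object can be used right away, here as a function argument.
# Only the result of len gets a name.
length = len([1, 2, 3])
print(length)  # 3
```

Since no name refers to such an object after the statement has finished, the code cannot access it again later on.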

Python has the built-in function id to get the ID of an object.

print(id(a))
print(id(b))
139672564187504
139672564187504

We see that a and b indeed refer to the same object.
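The two cases mentioned above, an object that existed before the assignment and an object created by the code following =, can be told apart by their IDs. The sketch below uses lists and hypothetical names x, y, z for illustration.

```python
x = [1, 2]
y = x          # existing object gets a second name
z = x + [3]    # the + operation creates a new list object

same_xy = id(x) == id(y)
same_xz = id(x) == id(z)
print(same_xy)  # True
print(same_xz)  # False
```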

The clear distinction between names and objects in Python adds flexibility, but also requires much more care when accessing or modifying data in memory. We will have to discuss possible pitfalls resulting from this concept at several points later on.
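As a first impression of such a pitfall, consider this small sketch with a list (lists are used here only as an example of an object that can be modified after creation). Modifying the object via one name is visible via every other name tied to it.

```python
first = [1, 2, 3]
second = first       # no copy, just a second name for the same list

second.append(4)     # modifies the one and only list object

print(first)                     # [1, 2, 3, 4]
print(id(first) == id(second))   # True
```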

Equality of Objects

In Python we have objects and we have values contained in the objects. Thus, there are two fundamentally different questions which might be relevant for controlling program flow:

  • Do two names refer to the same object?

  • Do two objects (referred to by two names) contain the same value?

Consider the following code:

a = 1.23
b = 1.23

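In an interactive session like the one used here, it creates two float objects both holding the value 1.23. (Whether equal literals yield separate objects is an implementation detail. CPython may reuse a single object, for instance if both lines are compiled together as part of a script.) To see that there are two objects we can look at the object IDs: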

print(id(a))
print(id(b))
139672516871888
139672519436656

So the answer to the first question is ‘no’, but the answer to the second question is ‘yes’.

To test whether two names refer to the same object, Python has the is operator. To test equality of values, Python has the == operator. Both yield a boolean value as result.

print(a is b)
print(a == b)
False
True

Negations of both operators are is not and !=, respectively. Using is is equivalent to comparing object IDs:

print(id(a) == id(b))
False
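The negated operators can be sketched in the same way. Lists are used here instead of floats, because two list displays always create two separate objects. The names x, y, z are chosen for illustration.

```python
x = [1, 2, 3]
y = [1, 2, 3]   # same value, but a separate list object
z = x           # second name for the object x refers to

print(x is not y)  # True: two different objects
print(x != y)      # False: values are equal
print(x is not z)  # False: same object
```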

Hint

The behavior of the is operator is hardwired in Python. It is equivalent to applying == to the integer objects returned by id. In contrast, == essentially calls the dunder method __eq__ of the left-hand side object. Thus, what happens during comparison depends on the object’s type. When writing your own classes (object types), you may implement the __eq__ method whenever appropriate. Without a custom implementation Python uses a default one behaving similarly to is.
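The hint above can be illustrated with a minimal sketch. The classes Point and PlainPoint are hypothetical examples, not part of Python. Point implements __eq__ to compare coordinates, whereas PlainPoint relies on the default comparison.

```python
class Point:
    """Hypothetical class with value-based comparison."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Leave comparison with unrelated types to Python's fallback.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


class PlainPoint:
    """Hypothetical class without custom __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
q = Point(1, 2)
print(p is q)   # False: two objects
print(p == q)   # True: custom __eq__ compares coordinates

r = PlainPoint(1, 2)
s = PlainPoint(1, 2)
print(r == s)   # False: default comparison behaves like 'is'
print(r == r)   # True
```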

Local versus Global Names

Names in Python have a scope, that is, a region of code where they are valid. Names defined outside functions and other structures are referred to as global names or global variables or simply globals. If a name is defined (that is, tied to some object) inside a function or some other structure, then the name is local. Local names are undefined outside the function or structure they are defined in.

def my_func():
    print(c)
    d = 456

c = 123
my_func()
print(d)
123
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [7], in <cell line: 7>()
      5 c = 123
      6 my_func()
----> 7 print(d)

NameError: name 'd' is not defined

If there is a local name which is also a global name, then its local version is used and the global one is left untouched.

def my_func():
    c = 456
    print(c)

c = 123
my_func()
print(c)
456
123

But how to change a global variable from inside a function? The global keyword tells the interpreter that a name appearing in a function refers to a global variable. The interpreter then uses the global variable instead of creating a new local variable.

def my_func():
    global c
    c = 456
    print(c)

c = 123
my_func()
print(c)
456
456

We cannot access a global variable from inside a function and then introduce a local variable with the same name. This leads to an error because each name appearing in an assignment in a function is considered local throughout the function. Consequently, accessing the value of a global variable before creating a corresponding local variable is interpreted as accessing an undefined name. The interpreter then complains about accessing a local variable before assignment (the exact wording of the error message depends on the Python version).

def my_func():
    print(c)
    c = 456

c = 123
my_func()
print(c)
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Input In [10], in <cell line: 6>()
      3     c = 456
      5 c = 123
----> 6 my_func()
      7 print(c)

Input In [10], in my_func()
      1 def my_func():
----> 2     print(c)
      3     c = 456

UnboundLocalError: local variable 'c' referenced before assignment

Important

It’s considered bad practice to use lots of global variables. Global variables result in low readability of code. Exceptions prove the rule.
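One common way to avoid modifying globals from inside a function is to pass objects as arguments and to hand results back via return. The following is only a sketch of that idea. The function name increment and the names counter and count are chosen for illustration.

```python
def increment(counter):
    # 'counter' is a local name; the result is returned to the caller
    # instead of being written to a global name.
    return counter + 1


count = 0
count = increment(count)   # the caller decides which name gets the result
print(count)  # 1
```

With this pattern everything a function reads or changes is visible in its call, which tends to make code easier to follow.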