Error Handling and Debugging Overview

Up to now we did not care about error handling. If something went wrong, the Python interpreter stopped execution and printed some message. But Python provides techniques for more controlled error handling.

Error Handling

Syntax Versus Runtime Errors

The Python interpreter parses the whole source code file before execution. In this phase the interpreter may encounter syntax errors. That is, the interpreter does not understand what we want it to do, because the code does not look like valid Python code. Syntax errors are usually easy for the programmer to fix.

The more serious types of errors are runtime errors (or semantic errors) which occur during program execution. Handling runtime errors is sometimes rather difficult.

Handling Runtime Errors

The traditional way of handling runtime errors is to avoid them altogether. All user input and all other sources of possible trouble get checked in advance by incorporating suitable if clauses in the code. This approach decreases readability of code, because the important lines are hidden between lots of error checking routines.

The more pythonic way of handling runtime errors is to use exceptions. Every time the interpreter encounters some problem, like division by zero, it raises an exception. The programmer may catch the exception and handle it appropriately, or the programmer may leave exception handling to the Python interpreter. In the latter case, the interpreter usually stops execution and prints a detailed error message.

Basic Exception Handling Syntax

Here is the basic syntax for catching and handling exceptions:

try:
    # code which may cause troubles
except ExceptionName:
    # code for handling a certain exception caused by code in try block
except AnotherExceptionName:
    # code for handling a certain exception caused by code in try block
else:
    # code to execute after successfully finishing try block

The try block contains the code to be protected, that is, the code which might raise an exception. Then there is at least one except block. The code in an except block is executed only if the specified exception has been raised in the try block. In this case, execution of the try block is stopped immediately and execution continues in the except block.

There can be several except blocks for handling different types of exceptions. Instead of a single exception name, a tuple of names can be given to handle several different exceptions in one block.
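Handling several exceptions in one block might look as follows. This is only a sketch: the function name safe_ratio and the None fallback are hypothetical choices for illustration, and the `as error` part (which binds the exception object to a name) is standard Python that is not covered in the text above.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None if division is not possible."""
    try:
        return numerator / denominator
    except (ZeroDivisionError, TypeError) as error:
        # one except block handles both exception types
        print('Cannot divide: {}'.format(type(error).__name__))
        return None


print(safe_ratio(1, 4))         # regular division
print(safe_ratio(1, 0))         # raises ZeroDivisionError internally
print(safe_ratio(1, 'four'))    # raises TypeError internally
```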

The else block is executed after successfully finishing the try block, that is, if no exception occurred. Here is the right place for code which shall only be executed if no exception occurred, but for which no explicit exception handling shall be implemented.

Here is an example:

a = 0    # some number from somewhere (e.g., user input)

try:
    b = 1 / a
except ZeroDivisionError:
    print('Division by zero. Setting result to 1000.')
    b = 1000    # set b to some (reasonable) value
else:
    print('Everything okay.')

print('Result is {}.'.format(b))
Division by zero. Setting result to 1000.
Result is 1000.

Without exception handling the interpreter would stop execution at the division line. By catching the exception we can avoid this automatic behavior and handle the problem in a way which does not prevent further program execution.

Note that exception names are not strings, but names of object types (classes). Thus, don’t use quotation marks.

print(type(ZeroDivisionError))
print(dir(ZeroDivisionError))
<class 'type'>
['__cause__', '__class__', '__context__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__suppress_context__', '__traceback__', 'args', 'with_traceback']

The Python documentation contains a list of built-in exceptions. The set of all exceptions has a tree structure, and we may define new exceptions if we need them to express errors specific to our program. These topics will be discussed in detail when delving deeper into object-oriented programming.
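The tree structure can be explored with the built-in issubclass function. The following brief sketch (the variable caught exists only for illustration) shows that an except block naming a more general exception also catches the more specific ones below it in the tree:

```python
# ZeroDivisionError is a special kind of ArithmeticError
print(issubclass(ZeroDivisionError, ArithmeticError))

# all ancestors of ZeroDivisionError in the exception tree
print(ZeroDivisionError.__mro__)

caught = False
try:
    1 / 0
except ArithmeticError:
    # the more general name also catches the more specific exception
    caught = True
    print('Caught via ArithmeticError.')
```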

Clean-Up

Sometimes it’s necessary to do some clean-up operations, like closing a file, no matter whether an exception occurred or not while the file was open for reading or writing. For this purpose Python provides the finally keyword:

try:
    # code which may cause troubles
except ExceptionName:
    # code for handling a certain exception caused by code in try block
else:
    # code to execute after successfully finishing try block
finally:
    # code for clean-up operations

The finally block is executed after the try block (if there is no else block) or after the else block if no exception occurred. If an exception occurred, then the finally block is executed after the corresponding except clause. If an exception is not handled by any except clause, the finally block is executed and the exception is re-raised afterwards. If try or except clauses contain break, continue or return, then the finally block is executed just before break, continue or return takes effect. If a finally block executed before return contains a return itself, then finally’s return value is used and the original one is discarded.
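The order of execution can be checked with a small experiment. In this sketch the names events, divide and finally_wins are hypothetical; the list events simply records which blocks were executed. Note that returning from a finally block is shown only to demonstrate the behavior and is generally considered poor style.

```python
events = []    # records the order in which blocks are executed


def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        events.append('except')
        return None
    else:
        events.append('else')
        return result
    finally:
        events.append('finally')    # runs before either return takes effect


def finally_wins():
    try:
        return 'from try'
    finally:
        return 'from finally'    # replaces the return value of the try block


print(divide(1, 2), events)
events.clear()
print(divide(1, 0), events)
print(finally_wins())
```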

Note

As long as a file is opened by our program, the operating system may block file access for other programs (details depend on the operating system and on how the file was opened). Thus, we should close a file as soon as possible. Forgetting to close a file is not too bad, because the OS will close it for us after program execution has stopped. But for long-running programs with only short file access at start-up a non-closed file may block access by other programs for hours or days. Thus, always, especially in case of exception handling, make sure that in each situation (with or without exception) files get closed properly by the program.

Objects With Predefined Clean-Up Actions

Some object types, file objects for instance, include predefined clean-up actions. That is, for certain operations (e.g., opening a file) they define what would otherwise have to be done in a corresponding finally block (e.g., closing the file) if the operations were placed in a try block.

To use this feature Python has the with keyword:

with open('some_file') as f:
    # do something with file object f

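If the open function is successful, then the indented code block is executed. If open fails, an exception is raised and the indented block is skipped; in this case there is no file to close. Once the file has been opened, with ensures that proper clean-up (closing the file) takes place, no matter whether the indented block finishes normally or raises an exception.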
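Whether the file really gets closed can be checked via the closed attribute of file objects. The following sketch uses a temporary file (created with the standard library's tempfile module) so that it does not depend on any existing file; the file content and the ValueError are made up for illustration.

```python
import os
import tempfile

# create a temporary file so that the example does not depend on existing files
handle, path = tempfile.mkstemp()
os.close(handle)

with open(path, 'w') as f:
    f.write('some text')
print(f.closed)    # the file has been closed when leaving the with block

try:
    with open(path) as g:
        content = g.read()
        raise ValueError('problem while processing the file')
except ValueError:
    print(g.closed)    # closed as well, although the block raised an exception

os.remove(path)    # remove the temporary file again
```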

Objects which can be used with with are said to support the context management protocol. Such objects can also be defined by the programmer using the dunder methods __enter__ and __exit__, see Python’s documentation for details.
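A minimal sketch of a programmer-defined object supporting the context management protocol could look like this. The class Resource and its log attribute are hypothetical and only record what happens; a real class would acquire and release some actual resource instead.

```python
class Resource:
    """Hypothetical resource which records when it is acquired and released."""

    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append('acquire')
        return self    # this object is bound to the name after 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append('release')
        return False    # do not suppress exceptions raised in the with block


try:
    with Resource() as r:
        r.log.append('use')
        raise RuntimeError('failure inside the with block')
except RuntimeError:
    pass

print(r.log)    # 'release' is recorded although an exception occurred
```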

The purpose of with is to make code more readable by avoiding too many try...except...finally blocks.

Logging and Debugging

Up to now we considered syntax errors, which basically are typos in the code, and semantic errors, which are caused by unexpected user input or failed file access. But code may contain more involved semantic errors, which may be hard to identify. The process of finding and correcting semantic errors is known as debugging.

A simple approach to debugging is to print status information during program flow. For private scripts and a data scientist’s everyday use this suffices. For higher quality programs the Python standard library provides the logging package, which allows redirecting some of the status information to a log file. Logging basics are described in the basic logging tutorial.
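Writing status information to a log file might be sketched as follows. The logger name 'example', the log file location and the messages are assumptions made for this illustration; the basic logging tutorial describes simpler and more complete setups.

```python
import logging
import os
import tempfile

# hypothetical log file location; any writable path works
log_path = os.path.join(tempfile.mkdtemp(), 'example.log')

logger = logging.getLogger('example')
logger.setLevel(logging.DEBUG)    # record messages of all severity levels

handler = logging.FileHandler(log_path)
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)

logger.debug('Starting computation.')
logger.warning('Input value is suspiciously large.')

# detach and close the handler to make sure everything is written to the file
logger.removeHandler(handler)
handler.close()

with open(log_path) as f:
    print(f.read())
```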

If looking at log messages does not suffice, there are programs specialized in debugging your code. We do not cover this topic here. But if you are interested, have a look at The Python Debugger and at Debugging with Spyder.

Profiling

Sometimes our code does what we want it to do, but it is too slow or consumes too much memory (up to an out-of-memory error from the operating system). Then it’s time for profiling.

You may use the Spyder Profiler or import profiling functionality from suitable Python packages.

Profiling Execution Time

The timeit module provides tools for measuring the execution time of small pieces of Python code in seconds.

import timeit

a = 1.23

code = """\
b = 4.56 * a
"""

timeit.timeit(stmt=code, number=1000000, globals=globals())
0.06846186006441712

This code snippet packs some code into the string code and passes it to the timeit function. This function executes the code number times and returns the total execution time of all runs in seconds; repeating the code many times increases accuracy. The built-in function globals returns a dictionary of all names defined at module level. This dictionary should be passed to the timeit function to give the timed code access to those names (here: a).
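To obtain the time for a single execution, the total time has to be divided by number. The following sketch also uses timeit.repeat, which repeats the whole measurement several times. The values of number and repeat are arbitrary choices, and instead of globals() a small dictionary containing only the name a is passed, which works the same way.

```python
import timeit

a = 1.23
number = 100000

# total time for `number` executions, measured 5 times in a row
totals = timeit.repeat(stmt='b = 4.56 * a', number=number, repeat=5,
                       globals={'a': a})

# the smallest total is least disturbed by other processes;
# dividing by number gives the time for one execution
per_execution = min(totals) / number
print('{:.1f} ns per execution'.format(per_execution * 1e9))
```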

Have a look at The Python Profilers, too.
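As a rough impression of what the profilers from the standard library do, here is a minimal sketch using cProfile and pstats. The function slow_sum is a hypothetical stand-in for the code to be profiled; the report lists how often each function was called and how much time was spent in it.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical function to be profiled."""
    total = 0
    for k in range(n):
        total += k * k
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100000)
profiler.disable()

# collect the report in a string, sorted by cumulative time, top 5 entries
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```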

Note

If working in Jupyter you may use the %timeit and %%timeit magics instead of the timeit module, the former for timing one line of code (%timeit one_line_of_code), the latter for timing the whole code cell (place it in the cell’s first line).

Profiling Memory Consumption

From a data science point of view, memory consumption is of interest as well, because handling large data sets requires lots of memory. There are many ways to obtain memory information. A simple one is as follows (install the pympler package first):

from pympler import asizeof

my_string = 'This is a string.'
my_int = 23

print(asizeof.asizeof(my_string))
print(asizeof.asizeof(my_int))
72
32

This gives the size in bytes of the memory allocated for some object. This number also includes the size of ‘subobjects’, that is, for example, all the objects referenced by a list object are included.
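Why including subobjects matters can be seen with sys.getsizeof from the standard library, which reports the size of the object itself only. The sketch below adds the sizes of the list items by hand; this simple sum is an illustration for a flat list of distinct strings only and is no general replacement for asizeof. Exact numbers depend on the Python version and platform.

```python
import sys

words = ['alpha', 'beta', 'gamma']

# size of the list object only, without the strings it refers to
shallow = sys.getsizeof(words)

# add the sizes of the referenced string objects by hand
with_items = shallow + sum(sys.getsizeof(w) for w in words)

print(shallow, with_items)
```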