Vectorization#

NumPy is very fast if operations are applied to whole arrays instead of element-by-element. Thus, we should try to avoid iterating through array elements and processing single elements. This idea is known as vectorization.

import numpy as np

Example: Vectorization via Indexing#

Imagine we have a vector of length \(n\), where \(n\) is even. We would like to interchange each number at an even index with its successor. The result shall be stored in a new array.

Here is the code based on a loop:

def interchange_loop(a):
    result = np.empty_like(a)
    for k in range(0, int(a.size / 2)):
        result[2 * k] = a[2 * k + 1]
        result[2 * k + 1] = a[2 * k]
    return result

print(interchange_loop(np.array([1, 2, 3, 4, 5, 6, 7, 8])))
[2 1 4 3 6 5 8 7]

And here with vectorization:

def interchange_vectorized(a):
    result = np.empty_like(a)
    result[0::2] = a[1::2]
    result[1::2] = a[0::2]
    return result

print(interchange_vectorized(np.array([1, 2, 3, 4, 5, 6, 7, 8])))
[2 1 4 3 6 5 8 7]

Now let’s look at the execution times:

%%timeit
interchange_loop(np.zeros(1000))
178 µs ± 5.66 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%%timeit
interchange_vectorized(np.zeros(1000))
2.96 µs ± 266 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The speed-up is of a factor of almost 75 for \(n=1000\) and becomes much better if \(n\) grows.

Example: Vectorization via Broadcasting#

Given two lists of numbers we want to have a two-dimensional array containing all products from these numbers.

Here is the code based on a loop:

def products_loop(a, b):
    result = np.empty((len(a), len(b)))
    for i in range(0, len(a)):
        for j in range(0, len(b)):
            result[i, j] = a[i] * b[j]
    return result

print(products_loop(range(1000), range(1000)))
[[0.00000e+00 0.00000e+00 0.00000e+00 ... 0.00000e+00 0.00000e+00
  0.00000e+00]
 [0.00000e+00 1.00000e+00 2.00000e+00 ... 9.97000e+02 9.98000e+02
  9.99000e+02]
 [0.00000e+00 2.00000e+00 4.00000e+00 ... 1.99400e+03 1.99600e+03
  1.99800e+03]
 ...
 [0.00000e+00 9.97000e+02 1.99400e+03 ... 9.94009e+05 9.95006e+05
  9.96003e+05]
 [0.00000e+00 9.98000e+02 1.99600e+03 ... 9.95006e+05 9.96004e+05
  9.97002e+05]
 [0.00000e+00 9.99000e+02 1.99800e+03 ... 9.96003e+05 9.97002e+05
  9.98001e+05]]

And here with vectorization:

def products_vectorized(a, b):
    result = np.array([a]).T * np.array([b])
    return result

print(products_vectorized(range(1000), range(1000)))
[[     0      0      0 ...      0      0      0]
 [     0      1      2 ...    997    998    999]
 [     0      2      4 ...   1994   1996   1998]
 ...
 [     0    997   1994 ... 994009 995006 996003]
 [     0    998   1996 ... 995006 996004 997002]
 [     0    999   1998 ... 996003 997002 998001]]

Hint

The T member variable of a NumPy array provides the transposed array. It’s not a copy (expensive) but a view (cheap). For detail see Linear Algebra Functions.

Execution times:

%%timeit
products_loop(range(1000), range(1000))
248 ms ± 3.36 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
products_vectorized(range(1000), range(1000))
1.25 ms ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Speed-up is factor 200 for lists of lenth 1000.

Important#

Whenever you see a loop in numerical routines, spend some time to vectorize it. Almost always that’s possible. Often vectorization also increases readability of the code.