Efficiency Considerations#

When working with large arrays, the most expensive operation (the one with the longest execution time) is copying arrays. Thus, we should avoid making copies of arrays wherever possible. But there are a few more things to consider when optimizing execution time and memory consumption.

import numpy as np
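To get a feeling for the cost of copying, we can time a view (which copies nothing) against a full copy of the same array. This is a minimal sketch; the array size is arbitrary and the timings are machine-dependent:

import timeit

a = np.ones((1000, 1000))

# a[:, :] creates a view (no data is copied); a.copy() duplicates all data
t_view = timeit.timeit(lambda: a[:, :], number=10_000)
t_copy = timeit.timeit(lambda: a.copy(), number=10_000)

print(f"10,000 views:  {t_view:.4f} s")
print(f"10,000 copies: {t_copy:.4f} s")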

Don’t Use append in Loops#

Sometimes data arrives in chunks and we have to build a large array step by step. We could start with an empty array and append each new chunk of data. If the incoming chunks are single numbers, the code could look as follows:

%%timeit

a = np.array([], dtype=np.int64)

for k in range(0, 100):
    a = np.append(a, k)
396 µs ± 55.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Each call to append creates a new (larger) array and copies the existing data into it. In the end we have made 100 expensive copy operations.
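We can verify that append really returns a freshly allocated array rather than extending the existing one. A small check using np.shares_memory, which tells us whether two arrays overlap in memory:

a = np.arange(5)
b = np.append(a, 5)

# b is a new array; it shares no memory with a
print(np.shares_memory(a, b))    # False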

If we know the final size of our array in advance, then we should create an array of final size before filling it with data:

%%timeit

a = np.empty(100, dtype=np.int64)    # allocate once, contents uninitialized

for k in range(0, 100):
    a[k] = k
9.12 µs ± 165 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

We get a speed-up of roughly a factor of 40, because the repeated copying is avoided.

Append to Lists Instead of Arrays#

If data arrives in chunks and we do not know the final array’s size in advance, we should use a Python list to store the data temporarily. Appending to a Python list is cheap, because the existing list data won’t be copied: the list only stores references to its items, and each item has its own (more or less random) location in memory. Once the data is complete, we create a NumPy array of the correct size and copy the list’s items into the array.

%%timeit

a = []
for k in range(0, 100):
    a.append(k)

b = np.array(a, dtype=np.int64)
11.5 µs ± 1.41 µs per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The speed-up compared to np.append is about a factor of 35.
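The same pattern works if the incoming chunks are arrays rather than single numbers: collect the chunks in a list and join them once at the end with np.concatenate. A sketch with made-up chunks of ten values each:

chunks = []
for k in range(0, 100):
    chunks.append(np.arange(10 * k, 10 * (k + 1)))    # hypothetical incoming chunk

b = np.concatenate(chunks)    # one allocation and one copy at the end
print(b.size)    # 1000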

Use Multidimensional Indices#

For multidimensional arrays we have two indexing variants:

  • multidimensional indexing (e.g., a[0, 0, 0]),

  • repeated one-dimensional indexing (e.g., a[0][0][0]).

The latter first creates a lower-dimensional slice a[0], then indexes this slice, creating another slice, and so on. Each step constructs a temporary view object, so this approach is less efficient than using a single multidimensional index.
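The intermediate objects are easy to make visible. Each step of repeated indexing yields a full ndarray view referring back to the original array; a small check (the .base attribute points to the array a view was taken from):

a = np.ones((10, 10, 10))

s = a[0]    # temporary 2d view, created just to be indexed again
print(type(s), s.shape)    # <class 'numpy.ndarray'> (10, 10)
print(s.base is a)         # True: s is only a view into a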

%%timeit

a = np.ones((10, 10, 10))
for k in range(0, 100):
    b = a[0][0][0]
37.3 µs ± 5.83 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%%timeit

a = np.ones((10, 10, 10))
for k in range(0, 100):
    b = a[0, 0, 0]
13.2 µs ± 344 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The speed-up is almost a factor of 3.

Remove Unused Arrays from Memory#

When working with large arrays, we should free memory as soon as possible (using del). A non-obvious situation in which memory can be freed arises when we only need a small view of a large array. Consider the following code:

a = np.ones((100, 100))    # large array resulting from some computation
b = a[0, :]    # we only need the first row
del a

Here the large array remains in memory although we only need the first row. Because the view b is based on the array object a, del a only removes the name a; garbage collection cannot remove the array object itself, since b still references its data. More efficient code:

a = np.ones((100, 100))    # large array resulting from some computation
b = a[0, :].copy()    # we only need the first row, make a copy
del a    # remove large array from memory

Here only the first (copied) row remains in memory. The original large array will be removed from memory by Python’s garbage collection as soon as possible.
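Whether an array is a view into another array can be checked via the .base attribute, which is None for an array that owns its data. A small sketch covering both variants from above:

a = np.ones((100, 100))

view = a[0, :]
copied = a[0, :].copy()

print(view.base is a)    # True: the view keeps the whole array alive
print(copied.base)       # None: the copy owns its data, a can be freed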