Copies and Views

NumPy arrays may be very large, so keeping many copies of the same array (or of its subarrays) is expensive in memory and time. NumPy avoids unnecessary copies by letting several arrays share the same data. The programmer has to keep track of which arrays share data and which arrays are independent of each other.

import numpy as np

Views

A view of a NumPy array is an ordinary ndarray object that shares its data with another array. The other array is called the base array of the view.

Views can be created with an array’s view method. The base array is accessible through the view’s base attribute; for an array that owns its data, base is None.

a = np.ones((100, 100))
b = a.view()

print('id of a:', id(a))
print('id of b:', id(b))
print('base of a:', a.base)
print('id of base of b:', id(b.base))
id of a: 139825110801392
id of b: 139825110801776
base of a: None
id of base of b: 139825110801392

The view method is rarely called directly (one use is reinterpreting the data as a different data type without copying). More often, views result from calling shape manipulation functions like reshape or fliplr:

b = a.reshape(10, 1000)
c = np.fliplr(a)

print(a.shape)
print(b.shape, b.base is a)
print(c.shape, c.base is a)
(100, 100)
(10, 1000) True
(100, 100) True
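
Note that reshape returns a view only when the requested shape can be expressed on top of the existing memory layout; otherwise NumPy returns a copy without warning. The following is a minimal sketch of both cases, with illustrative variable names, using np.shares_memory to tell them apart:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # contiguous 2x3 array
t = a.T                          # transposed view, no longer contiguous

v = a.reshape(3, 2)              # compatible with the memory layout: a view
r = t.reshape(6)                 # not compatible with the layout: a copy

print(np.shares_memory(a, v))    # True
print(np.shares_memory(a, r))    # False
```

If code relies on getting a view, checking the result this way is safer than assuming it.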

Writing to a view alters the data of the base array and of all other views of it:

b[0, 0] = 5

print(a[0, 0], b[0, 0], c[0, -1])
5.0 5.0 5.0

Important

Writing data to views modifies the base array! This is a common source of errors, which are very hard to track down. Always keep track of which of your arrays are views!
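
One way to find out whether two arrays share data is np.shares_memory; the owndata flag additionally tells whether an array owns its buffer. A small sketch, with variable names chosen for illustration only:

```python
import numpy as np

a = np.ones((4, 4))
b = a.reshape(2, 8)   # view of a
c = a.copy()          # independent copy of a

print(np.shares_memory(a, b))             # True
print(np.shares_memory(a, c))             # False
print(a.flags.owndata, b.flags.owndata)   # True False
```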

Slicing Creates Views

Views may be smaller than the original array. Such views of subarrays originate from slicing operations:

a = np.ones((100, 100))
b = a[4:10, :]
c = a[5]

print(a.shape)
print(b.shape, b.base is a)
print(c.shape, c.base is a)
(100, 100)
(6, 100) True
(100,) True

Again, modifying a view alters the original array:

b[1, 0] = 5

print(a[5, 0], b[1, 0], c[0])
5.0 5.0 5.0
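
In contrast to slicing, indexing with an integer list or a boolean mask returns a copy, so writing to the result leaves the original array untouched. A short sketch of the difference (the names s, f and m are illustrative):

```python
import numpy as np

a = np.ones((5, 5))
s = a[0:2]            # slicing: view of a
f = a[[0, 1]]         # integer list indexing: copy
m = a[a[:, 0] > 0]    # boolean mask indexing: copy

f[0, 0] = 5
m[0, 0] = 7

print(a[0, 0])                  # 1.0
print(np.shares_memory(a, s))   # True
print(np.shares_memory(a, f))   # False
print(np.shares_memory(a, m))   # False
```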

Copies

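A NumPy array’s copy method yields a new array with its own, independent data. For arrays of Python objects (dtype object), only the references are copied, not the objects they point to.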

a = np.ones((100, 100))
b = a.copy()

b[0, 0] = 5

print(a[0, 0], b[0, 0])
print(b.base)
1.0 5.0
None
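
For numeric arrays, copy gives full independence. For arrays with dtype object, the copy holds references to the same Python objects as the original. The following sketch illustrates this with a list stored in an array and compares against copy.deepcopy from the standard library:

```python
import copy
import numpy as np

a = np.empty(1, dtype=object)
a[0] = [1, 2]            # the array stores a reference to this list

b = a.copy()             # new array, same list object inside
d = copy.deepcopy(a)     # new array, new list object inside

b[0].append(3)           # mutates the list shared by a and b

print(a[0])   # [1, 2, 3]
print(d[0])   # [1, 2]
```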

Hint

NumPy arrays are mutable objects. Thus, assigning a new name to an array or passing an array to a function does not copy the array. Keep this in mind, because functions you call in your code may alter your arrays. Conversely, when writing functions other people might use, clearly state in the documentation whether your function modifies arrays passed as parameters. If in doubt, use copy.
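
The difference between a function that modifies its argument and one that does not can be sketched as follows. Both scale_inplace and scaled are hypothetical helpers written for this illustration, not NumPy functions:

```python
import numpy as np

def scale_inplace(x, factor):
    """Multiply x by factor in place. Modifies the caller's array."""
    x *= factor
    return x

def scaled(x, factor):
    """Return a scaled copy of x. Leaves the caller's array untouched."""
    y = x.copy()
    y *= factor
    return y

a = np.ones(3)

b = scaled(a, 2.0)
print(a)   # [1. 1. 1.]
print(b)   # [2. 2. 2.]

scale_inplace(a, 2.0)
print(a)   # [2. 2. 2.]
```

Stating the behavior in the docstring, as done here, lets callers decide whether they need to pass a copy.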