
Understanding np.newaxis and np.expand_dims in NumPy


To add new dimensions to a NumPy array (ndarray), you can use np.newaxis, np.expand_dims(), or np.reshape() (or the reshape() method of ndarray).

This post describes the following contents.

  • How to use np.newaxis
    • np.newaxis is None
    • Add new dimensions with np.newaxis
    • Control broadcasting with np.newaxis
  • Add a new dimension with np.expand_dims()
  • np.reshape()

How to use np.newaxis

np.newaxis is None

np.newaxis is an alias of None.

import numpy as np

print(np.newaxis is None)
# True

It’s just given an alias to make it easier to understand. If you replace np.newaxis in the sample code below with None, it works the same way.
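As a quick check of that claim, the sketch below indexes the same array once with np.newaxis and once with None. The array name x is only illustrative.

```python
import numpy as np

x = np.arange(3)  # illustrative 1-D array, shape (3,)

col_newaxis = x[:, np.newaxis]  # shape (3, 1)
col_none = x[:, None]           # same indexing, written with None

print(col_newaxis.shape, col_none.shape)
# (3, 1) (3, 1)

print(np.array_equal(col_newaxis, col_none))
# True
```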

Add new dimensions with np.newaxis

Using np.newaxis inside [] adds a new dimension of size 1 at that position.

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

print(a.shape)
# (2, 3)

print(a[:, :, np.newaxis])
# [[[0]
#   [1]
#   [2]]
# 
#  [[3]
#   [4]
#   [5]]]

print(a[:, :, np.newaxis].shape)
# (2, 3, 1)

print(a[:, np.newaxis, :])
# [[[0 1 2]]
# 
#  [[3 4 5]]]

print(a[:, np.newaxis, :].shape)
# (2, 1, 3)

print(a[np.newaxis, :, :])
# [[[0 1 2]
#   [3 4 5]]]

print(a[np.newaxis, :, :].shape)
# (1, 2, 3)

The trailing : in [] can be omitted.

print(a[:, np.newaxis])
# [[[0 1 2]]
# 
#  [[3 4 5]]]

print(a[:, np.newaxis].shape)
# (2, 1, 3)

print(a[np.newaxis])
# [[[0 1 2]
#   [3 4 5]]]

print(a[np.newaxis].shape)
# (1, 2, 3)

Consecutive : can be replaced with ... (Ellipsis). This is especially convenient when you want to add a new dimension at the end of an ndarray that has many dimensions. See the appendix for more details on Ellipsis.

print(a[..., np.newaxis])
# [[[0]
#   [1]
#   [2]]
# 
#  [[3]
#   [4]
#   [5]]]

print(a[..., np.newaxis].shape)
# (2, 3, 1)
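The benefit is easier to see with more dimensions. The following sketch uses an illustrative four-dimensional array x and compares writing out every : with using ... instead.

```python
import numpy as np

x = np.zeros((2, 3, 4, 5))  # illustrative 4-D array

explicit = x[:, :, :, :, np.newaxis]  # one ':' per existing dimension
with_ellipsis = x[..., np.newaxis]    # '...' stands in for all of them

print(explicit.shape)
# (2, 3, 4, 5, 1)

print(with_ellipsis.shape)
# (2, 3, 4, 5, 1)
```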

You can use np.newaxis several times in one index. Each occurrence adds one dimension.

print(a[np.newaxis, :, np.newaxis, :, np.newaxis])
# [[[[[0]
#     [1]
#     [2]]]
# 
# 
#   [[[3]
#     [4]
#     [5]]]]]

print(a[np.newaxis, :, np.newaxis, :, np.newaxis].shape)
# (1, 2, 1, 3, 1)

Adding a dimension with np.newaxis returns a view of the original array. Because the original and the view share memory, changing an element of one changes the corresponding element of the other.

a_newaxis = a[:, :, np.newaxis]

print(np.shares_memory(a, a_newaxis))
# True
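To make the consequence of sharing memory concrete, here is a minimal sketch that writes through the view and then inspects the original. The names base, view, and independent are illustrative, and copy() is shown as one way to get an array that does not share memory.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)  # illustrative array
view = base[:, :, np.newaxis]      # shape (2, 3, 1), shares memory with base

view[0, 0, 0] = 100                # write through the view
print(base)
# [[100   1   2]
#  [  3   4   5]]

independent = base[:, :, np.newaxis].copy()  # copy() allocates new memory
independent[0, 0, 0] = -1
print(base[0, 0])
# 100
```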

Control broadcasting with np.newaxis

In an operation between two NumPy arrays (ndarray), the arrays are automatically expanded to a common shape by broadcasting.

a = np.zeros(27, dtype=int).reshape(3, 3, 3)  # np.int was removed in NumPy 1.24; use the built-in int
print(a)
# [[[0 0 0]
#   [0 0 0]
#   [0 0 0]]
# 
#  [[0 0 0]
#   [0 0 0]
#   [0 0 0]]
# 
#  [[0 0 0]
#   [0 0 0]
#   [0 0 0]]]

print(a.shape)
# (3, 3, 3)

b = np.arange(9).reshape(3, 3)
print(b)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

print(b.shape)
# (3, 3)

print(a + b)
# [[[0 1 2]
#   [3 4 5]
#   [6 7 8]]
# 
#  [[0 1 2]
#   [3 4 5]
#   [6 7 8]]
# 
#  [[0 1 2]
#   [3 4 5]
#   [6 7 8]]]

In broadcasting, new dimensions of size 1 are added to the beginning of the array that has fewer dimensions.

If you add a new dimension to the beginning with np.newaxis, the result is the same as the automatic conversion done by broadcasting.

print(b[np.newaxis, :, :].shape)
# (1, 3, 3)

print(a + b[np.newaxis, :, :])
# [[[0 1 2]
#   [3 4 5]
#   [6 7 8]]
# 
#  [[0 1 2]
#   [3 4 5]
#   [6 7 8]]
# 
#  [[0 1 2]
#   [3 4 5]
#   [6 7 8]]]
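Because new dimensions are only ever added at the beginning, shapes are effectively compared from the last dimension backwards. The sketch below checks the broadcast shape with np.broadcast() and shows a case where the automatic rule fails and an explicit np.newaxis is needed. The names a3, b2, m, and v are illustrative.

```python
import numpy as np

a3 = np.zeros((3, 3, 3), dtype=int)
b2 = np.arange(9).reshape(3, 3)

# np.broadcast reports the shape the operands would be broadcast to.
print(np.broadcast(a3, b2).shape)
# (3, 3, 3)

m = np.zeros((4, 3))
v = np.arange(4)  # shape (4,) is treated as (1, 4), which does not match (4, 3)

try:
    m + v
except ValueError:
    print("shapes (4, 3) and (4,) cannot be broadcast together")

# Placing the new dimension at the end makes v a column of shape (4, 1).
print((m + v[:, np.newaxis]).shape)
# (4, 3)
```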

Adding the new dimension at a different position gives different results.

print(b[:, np.newaxis, :].shape)
# (3, 1, 3)

print(a + b[:, np.newaxis, :])
# [[[0 1 2]
#   [0 1 2]
#   [0 1 2]]
# 
#  [[3 4 5]
#   [3 4 5]
#   [3 4 5]]
# 
#  [[6 7 8]
#   [6 7 8]
#   [6 7 8]]]

print(b[:, :, np.newaxis].shape)
# (3, 3, 1)

print(a + b[:, :, np.newaxis])
# [[[0 0 0]
#   [1 1 1]
#   [2 2 2]]
# 
#  [[3 3 3]
#   [4 4 4]
#   [5 5 5]]
# 
#  [[6 6 6]
#   [7 7 7]
#   [8 8 8]]]

For example, suppose you want to add or subtract a color image (shape: (height, width, color)) and a monochrome image (shape: (height, width)). The two arrays cannot be broadcast as they are, but adding a new dimension at the end of the monochrome image makes the operation work.
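The image case can be sketched with small dummy arrays. The sizes and the names color and mono are assumptions chosen for illustration, not real image data.

```python
import numpy as np

height, width = 4, 5                   # small illustrative sizes
color = np.ones((height, width, 3))    # (height, width, color)
mono = np.full((height, width), 0.5)   # (height, width)

try:
    color - mono
except ValueError:
    print("shapes (4, 5, 3) and (4, 5) cannot be broadcast together")

# With a trailing dimension, mono has shape (4, 5, 1) and is applied
# to every color channel.
diff = color - mono[:, :, np.newaxis]
print(diff.shape)
# (4, 5, 3)
```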

Add a new dimension with np.expand_dims()

You can also add a new dimension to ndarray using np.expand_dims().

Specify the original ndarray in the first argument a and the position to add the dimension in the second argument axis.

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

print(np.expand_dims(a, 0))
# [[[0 1 2]
#   [3 4 5]]]

print(np.expand_dims(a, 0).shape)
# (1, 2, 3)

You can insert a new dimension at any position as follows:

print(np.expand_dims(a, 0).shape)
# (1, 2, 3)

print(np.expand_dims(a, 1).shape)
# (2, 1, 3)

print(np.expand_dims(a, 2).shape)
# (2, 3, 1)

A negative value can also be specified for the second argument axis. -1 corresponds to the last dimension, so you can specify the position counting from the end.

print(np.expand_dims(a, -1).shape)
# (2, 3, 1)

print(np.expand_dims(a, -2).shape)
# (2, 1, 3)

print(np.expand_dims(a, -3).shape)
# (1, 2, 3)

In NumPy 1.17, specifying an out-of-range value such as axis > a.ndim or axis < -a.ndim - 1 for the second argument axis does not raise an error. NumPy only issues a DeprecationWarning and still inserts a dimension.

However, as the warning message says, this raises an error in later versions (an AxisError since NumPy 1.18), so you should avoid it.

print(np.expand_dims(a, 3).shape)
# (2, 3, 1)
# 
# /usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
#   """Entry point for launching an IPython kernel.

print(np.expand_dims(a, -4).shape)
# (2, 1, 3)
# 
# /usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
#   """Entry point for launching an IPython kernel.
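The output above comes from NumPy 1.17. The following sketch assumes NumPy 1.18 or later, where the deprecation has expired and an out-of-range axis raises AxisError. AxisError is a subclass of both ValueError and IndexError, so catching ValueError works regardless of where the class is exposed in your NumPy version.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

raised = False
try:
    np.expand_dims(a, 3)       # valid axes for a 2-D array are -3 to 2
except ValueError as err:      # AxisError subclasses ValueError and IndexError
    raised = True
    print(type(err).__name__)
# AxisError
```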

In NumPy 1.17, only an integer can be specified for the second argument axis. In that version it is not possible to add multiple dimensions at once by passing a list or tuple of positions.

# print(np.expand_dims(a, (0, 1)).shape)
# TypeError: '>' not supported between instances of 'tuple' and 'int'
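This limitation is specific to older releases. As of NumPy 1.18, axis also accepts a tuple of ints, and each entry gives the position of a new axis in the result. A short sketch, assuming NumPy 1.18 or later:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Two new axes at the front of the result.
print(np.expand_dims(a, (0, 1)).shape)
# (1, 1, 2, 3)

# One new axis at the front and one at the end of the result.
print(np.expand_dims(a, (0, -1)).shape)
# (1, 2, 3, 1)
```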

As with np.newaxis, np.expand_dims() returns a view.

a_expand_dims = np.expand_dims(a, 0)

print(np.shares_memory(a, a_expand_dims))
# True

It is, of course, possible to control broadcasting by adding a new dimension with np.expand_dims() as in the example of np.newaxis above.
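As a brief illustration, the sketch below repeats one of the broadcasting examples from the np.newaxis section and confirms that np.expand_dims() gives the same result. The variable names via_newaxis and via_expand_dims are illustrative.

```python
import numpy as np

a = np.zeros((3, 3, 3), dtype=int)
b = np.arange(9).reshape(3, 3)

via_newaxis = a + b[:, np.newaxis, :]       # b has shape (3, 1, 3)
via_expand_dims = a + np.expand_dims(b, 1)  # same shape, built with expand_dims

print(np.array_equal(via_newaxis, via_expand_dims))
# True

print(via_expand_dims[1])
# [[3 4 5]
#  [3 4 5]
#  [3 4 5]]
```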

np.reshape()

You can reshape an ndarray with np.reshape() or the reshape() method of ndarray.

If you specify a shape with a new dimension to reshape(), the result is, of course, the same as when using np.newaxis or np.expand_dims().

a = np.arange(6).reshape(2, 3)
print(a)
# [[0 1 2]
#  [3 4 5]]

print(a.shape)
# (2, 3)

print(a[np.newaxis])
# [[[0 1 2]
#   [3 4 5]]]

print(a[np.newaxis].shape)
# (1, 2, 3)

print(np.expand_dims(a, 0))
# [[[0 1 2]
#   [3 4 5]]]

print(np.expand_dims(a, 0).shape)
# (1, 2, 3)

print(a.reshape(1, 2, 3))
# [[[0 1 2]
#   [3 4 5]]]

print(a.reshape(1, 2, 3).shape)
# (1, 2, 3)

As you can see from the above example, using np.newaxis and np.expand_dims() has the advantage that you don’t have to explicitly specify the size of the original dimension.

Even with reshape(), if you want to add a dimension at the beginning or the end, you can avoid writing out the sizes explicitly by unpacking the original shape with *.

print(a.reshape(1, *a.shape))
# [[[0 1 2]
#   [3 4 5]]]

print(a.reshape(1, *a.shape).shape)
# (1, 2, 3)
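The same unpacking works for adding a dimension at the end. The sketch below also checks that the reshaped result shares memory with the original here; reshape() returns a view when it can, which is the case for this contiguous array.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

a_end = a.reshape(*a.shape, 1)  # unpack (2, 3), then append a size-1 dimension
print(a_end.shape)
# (2, 3, 1)

print(np.shares_memory(a, a_end))
# True
```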

Appendix:

Python built-in constant Ellipsis (...)

In Python 3, Ellipsis is defined as a built-in constant.

print(Ellipsis)
# Ellipsis

Ellipsis can also be written as ... (three dots).

print(...)
# Ellipsis

Ellipsis and ... are the same ellipsis object.

print(type(Ellipsis))
# <class 'ellipsis'>

print(type(...))
# <class 'ellipsis'>

print(Ellipsis is ...)
# True

As of Python 3.7.2, Ellipsis (...) has no special role in Python's basic grammar, but NumPy makes convenient use of it, as shown below.

Using Ellipsis (...) in NumPy

In NumPy, you can use Ellipsis (...) to omit intermediate dimensions when specifying elements or ranges with [].

Take the following four-dimensional array as an example.

import numpy as np

a = np.arange(120).reshape(2, 3, 4, 5)

print(a.shape)
# (2, 3, 4, 5)

For example, if you want to specify an index only for the last dimension, you can write : for all the other dimensions.

print(a[:, :, :, 0])
# [[[  0   5  10  15]
#   [ 20  25  30  35]
#   [ 40  45  50  55]]
# 
#  [[ 60  65  70  75]
#   [ 80  85  90  95]
#   [100 105 110 115]]]

With ..., you can write:

print(a[..., 0])
# [[[  0   5  10  15]
#   [ 20  25  30  35]
#   [ 40  45  50  55]]
# 
#  [[ 60  65  70  75]
#   [ 80  85  90  95]
#   [100 105 110 115]]]

The same applies when you want to specify only the first and last dimensions. You can omit the dimension in the middle with ....

print(a[0, :, :, 0])
# [[ 0  5 10 15]
#  [20 25 30 35]
#  [40 45 50 55]]

print(a[0, ..., 0])
# [[ 0  5 10 15]
#  [20 25 30 35]
#  [40 45 50 55]]

You may use Ellipsis instead of ....

print(a[Ellipsis, 0])
# [[[  0   5  10  15]
#   [ 20  25  30  35]
#   [ 40  45  50  55]]
# 
#  [[ 60  65  70  75]
#   [ 80  85  90  95]
#   [100 105 110 115]]]

print(a[0, Ellipsis, 0])
# [[ 0  5 10 15]
#  [20 25 30 35]
#  [40 45 50 55]]

With :, the number of : must match the number of dimensions, but with ... you do not have to worry about that.
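This makes ... useful in code that must handle arrays with different numbers of dimensions. The helper below, take_first_of_last_axis, is a hypothetical name used only for this sketch.

```python
import numpy as np

def take_first_of_last_axis(x):
    """Return x[..., 0] for an array with one or more dimensions."""
    return x[..., 0]

# The same index expression works for 1-D, 2-D, and 4-D input.
for shape in [(5,), (4, 5), (2, 3, 4, 5)]:
    x = np.zeros(shape)
    print(shape, "->", take_first_of_last_axis(x).shape)
# (5,) -> ()
# (4, 5) -> (4,)
# (2, 3, 4, 5) -> (2, 3, 4)
```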

If there are two or more ..., it is unclear where to omit, so an error is raised.

# print(a[..., 0, ...])
# IndexError: an index can only have a single ellipsis ('...')

If all the remaining indices up to the last dimension are :, they can be omitted. There is no need to write ... either.

print(a[0, 0, :, :])
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

print(a[0, 0])
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

print(a[0, 0, ...])
# [[ 0  1  2  3  4]
#  [ 5  6  7  8  9]
#  [10 11 12 13 14]
#  [15 16 17 18 19]]

Source:

https://note.nkmk.me/en/python-numpy-newaxis/

Amir Masoud Sefidian
Machine Learning Engineer