np.reshape() - The Ultimate Guide in Python

Most of the function names in Python can be intuitively connected to the meaning of the function. The NumPy reshape() function is not an exception.

The reshape() function brings an array into another shape while keeping all the original data.

To summarize how np.reshape() works:

NumPy’s reshape() function takes the array to be reshaped as its first argument and the new shape (an integer or a tuple of integers) as its second argument. Where possible, it returns a new view on the existing data rather than a full copy of the original array. A view shares its data with the original, so changing an element through one of them changes the other as well; only when NumPy has to copy the data is the result independent.

Here are a couple of minimal examples:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5, 6])
>>> np.reshape(a, (2,3))
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.reshape(a, (3,2))
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.reshape(a, (2,3))
array([[1, 2, 3],
       [4, 5, 6]])
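
The view behavior can be checked directly. The following is a minimal sketch, assuming a small contiguous array for which reshape() can return a view; the variable names are only illustrative:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = np.reshape(a, (2, 3))

# For a contiguous array like this one, reshape returns a view, not a copy.
shares = np.shares_memory(a, b)
print(shares)  # True

# Writing through the reshaped view changes the original array as well.
b[0, 0] = 99
print(a)  # [99  2  3  4  5  6]

# Call .copy() when an independent array is needed.
c = np.reshape(a, (2, 3)).copy()
c[0, 0] = -1
print(a[0])  # 99
```

If the original must stay untouched, copying the result as in the last step is the safer choice.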

Before we dive into more explanations of the shape and related characteristics, let’s quickly glance over the parameters and syntax next!

Parameters and Syntax

numpy.reshape(a, newshape, order='C')

Parameter	Type	Description
`a`	`array_like`	Array to be reshaped.
`newshape`	`int` or tuple of integers	The new shape and the original shape should be compatible. If the new shape is an integer `i`, the reshaped array will be a 1-D array with the length `i`. If the new shape is a tuple, each tuple element specifies the shape of one dimension. One shape dimension can be `-1` in which case the value is inferred from the array length and the remaining dimensions.
`order`	`{'C', 'F', 'A'}`, optional, default `'C'`	If specified, reads and places the elements of `a` using this index order. `'C'`: read and write the elements with the last axis index changing fastest, back to the first axis index changing slowest. `'F'`: read and write the elements with the first index changing fastest and the last index changing slowest. `'A'`: read and write the elements in `'F'` order if `a` is Fortran contiguous in memory, and in `'C'` order otherwise.

Return Value: The output of the np.reshape() function is the reshaped ndarray. It is a view of the original data if possible and a copy otherwise.
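
To make the parameters concrete, here is a small sketch of an integer shape, an inferred -1 dimension, and the two most common order values. The shape is passed positionally, and the variable names are chosen only for illustration:

```python
import numpy as np

a = np.arange(6)  # [0 1 2 3 4 5]

# An integer shape yields a 1-D array of that length.
flat = np.reshape(a, 6)

# -1 lets NumPy infer one dimension: 6 elements / 2 rows = 3 columns.
inferred = np.reshape(a, (2, -1))

# order='C' fills row by row (last index changes fastest).
c_order = np.reshape(a, (2, 3), order='C')
print(c_order)
# [[0 1 2]
#  [3 4 5]]

# order='F' fills column by column (first index changes fastest).
f_order = np.reshape(a, (2, 3), order='F')
print(f_order)
# [[0 2 4]
#  [1 3 5]]
```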

The Shape Property of a NumPy Array

Before focusing on the reshape() function, we need to understand some basic NumPy concepts.

Let’s assume that we have a large data set and counting the number of entries would be an impossible task. We could use the shape attribute to find the number of elements along each dimension of this array.

🛑 Attention: Remember that shape is an attribute, not a function. Attributes are not followed by parentheses.

The shape attribute always returns a tuple that tells us the length of each dimension.
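
As a quick illustration of the difference, the sketch below reads the attribute and then shows what happens if parentheses are added by mistake. The array is an arbitrary example:

```python
import numpy as np

arr = np.arange(10)

# Attribute access: no parentheses.
print(arr.shape)  # (10,)

# The attribute is a plain tuple, so calling it fails.
try:
    arr.shape()
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# NumPy also offers a function form that returns the same tuple.
print(np.shape(arr) == arr.shape)  # True
```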

A one-dimensional (1D) array has no rows and columns, so its shape attribute returns a tuple with a single value. Python writes a one-element tuple as the value followed by a comma, for example (10,).

Let’s have a look at an example:

import numpy as np

# 1D NumPy array
arr = np.arange(10)

print(arr)
# [0 1 2 3 4 5 6 7 8 9]

print(arr.shape)
# (10,)

The code snippet also uses the NumPy arange() function to create an initial array of consecutive integers from 0 to 9.

💡 Reference: Please find a detailed discussion of the NumPy arange function in this Finxter blog article.

The shape attribute of a two-dimensional (2D) array, also called a matrix, is a tuple with two values: the number of rows and the number of columns.

# A two-dimensional NumPy array
import numpy as np

arr = np.array([[1,2,3,4,5], [5,4,3,2,1]])
print(arr.shape)
# (2, 5)

The following example is for the shape of three-dimensional (3D) arrays.

# A three-dimensional array
import numpy as np

arr = np.array([[[0, 11, 15, 16], 
                 [3, 7, 10, 34], 
                 [44, 99, 5, 67]],
                [[52, 8, 11, 13], 
                 [0, 4, 5, 6], 
                 [4, 4, 4, 4]]])
print(arr.shape)
# (2, 3, 4)

It takes some practice to understand the shape tuple for multidimensional arrays.

The dimensions represented by the tuple are read from the outside in.

If you observe the brackets, the outermost bracket encloses the whole array. The first value in the shape tuple, 2, is the number of elements directly inside that outermost bracket, which are the blocks opened by the second level of brackets. If you count them, you will see that there are 2 elements in this dimension.

1st element [[0, 11, 15, 16], [3, 7, 10, 34], [44, 99, 5, 67]]

2nd element [[52, 8, 11, 13], [0, 4, 5, 6], [4, 4, 4, 4]]

Each of these elements contains three more elements, which make up the second dimension. If you think about nested lists, you can draw the analogy.

For the first block, these elements are:

1st element [0, 11, 15, 16]

2nd element [3, 7, 10, 34]

3rd element [44, 99, 5, 67]

Finally, the number 4 represents the number of elements in the third dimension. Those are the innermost elements, for example 0, 11, 15, and 16.
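
The outside-in reading can be verified with len() at each nesting level. This is a small sketch using the same 3D array; the names outer, middle, and inner are only illustrative:

```python
import numpy as np

arr = np.array([[[0, 11, 15, 16],
                 [3, 7, 10, 34],
                 [44, 99, 5, 67]],
                [[52, 8, 11, 13],
                 [0, 4, 5, 6],
                 [4, 4, 4, 4]]])

# Lengths at each nesting level, read from the outside in.
outer = len(arr)         # 2 blocks
middle = len(arr[0])     # 3 rows per block
inner = len(arr[0][0])   # 4 values per row

print((outer, middle, inner) == arr.shape)  # True
print(arr.ndim)  # 3, the number of dimensions
print(arr.size)  # 24, which is 2 * 3 * 4
```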

What is the reshape() Function in NumPy?

How do we relate NumPy’s shape attribute to the NumPy reshape() function?

Syntax

numpy.reshape(arr, newshape, order)

where

arr is the array we wish to reshape,
newshape is an integer for a one-dimensional result or a tuple of integers for multiple dimensions, and
order is an optional argument that we won’t use in the examples in this guide.

Reshaping an array can be useful when cleaning the data, or if there are some simple element-wise calculations that need to be performed.

One of the advantages that a NumPy array has over a Python list is that vectorized operations are easier to perform. Moreover, reshaping arrays is common in machine learning.

Keep in mind that all the elements in the NumPy array must be of the same type.
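
A short sketch of what “same type” means in practice: when values of different numeric types are mixed, NumPy converts them to one common dtype, and reshaping leaves that dtype unchanged. The arrays here are arbitrary examples:

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])   # the integers are upcast to float

print(ints.dtype.kind)   # i (an integer type)
print(mixed.dtype)       # float64

# Reshaping never changes the dtype.
reshaped = np.reshape(mixed, (3, 1))
print(reshaped.dtype)    # float64
```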

Reshape NumPy Array 1D to 2D

Multiple Columns

Let’s say that we were measuring the outside temperature 3 days in a row, both in Celsius and Fahrenheit.

We recorded our measurements as a one-dimensional (1D) vector where all the even indices hold the temperature in degrees Celsius and all the odd indices hold the temperature in degrees Fahrenheit.

temp = [10, 50, 15, 59, 5, 42]

There are 6 elements recorded in a single row.

To reshape the one-dimensional temp array to a two-dimensional array, we need to pass a tuple with a number of rows and columns to the reshape function.

Specifically, this tuple will consist of two numbers, let’s call them m and n, where the first number is the number of rows, and the second number is the number of columns.

💡 Note: m*n, the number of rows multiplied by the number of columns, must be the same as the number of elements in the original array. In this example, the number of elements in the original array is 6*1=6.

So, apart from the trivial shapes with a single row or a single column, we only have two options for the two-dimensional array: 2 rows and 3 columns, or 3 rows and 2 columns.
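
The m*n rule can be sketched in code. The snippet below lists every 2D shape that fits six elements, including the single-row and single-column cases, and shows that any other shape is rejected. The name valid_shapes is only illustrative:

```python
import numpy as np

temp = np.array([10, 50, 15, 59, 5, 42])

# Every (m, n) pair with m * n == temp.size is a valid 2D shape.
valid_shapes = [(m, temp.size // m) for m in range(1, temp.size + 1)
                if temp.size % m == 0]
print(valid_shapes)  # [(1, 6), (2, 3), (3, 2), (6, 1)]

# Any other shape raises a ValueError, because 4 * 2 is not 6.
try:
    np.reshape(temp, (4, 2))
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```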

import numpy as np

temp = [10, 50, 15, 59, 5, 42]

temp = np.reshape(temp, (3,2))
print(temp)

"""
[[10 50]
 [15 59]
 [ 5 42]]
"""

The data hasn’t changed; the same elements are in the same order. They are rearranged into three rows and two columns, so each row holds one day’s temperature in Celsius and Fahrenheit.
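
Once the readings are in this layout, each unit can be pulled out as a column. A minimal sketch, assuming the same six readings; the names celsius and fahrenheit are only illustrative:

```python
import numpy as np

temp = np.reshape([10, 50, 15, 59, 5, 42], (3, 2))

celsius = temp[:, 0]      # first column: even indices of the flat list
fahrenheit = temp[:, 1]   # second column: odd indices of the flat list

print(celsius)      # [10 15  5]
print(fahrenheit)   # [50 59 42]

# Vectorized operations now work per column, e.g. the maximum of each unit.
column_max = temp.max(axis=0)
print(column_max)   # [15 59]
```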

One Column

In the section about the shape attribute, we said that the shape of a one-dimensional array is given by a tuple that contains an integer followed by a comma. Then we explained that this vector doesn’t contain rows or columns.

What if we want this vector to have one column and as many rows as there are elements?

We can do this using reshape(). Even though there is only one column, this array will have two dimensions.

import numpy as np

arr = np.arange(10)
print(arr.shape)
# (10,)

# Reshape the vector into a single column
arr = np.reshape(arr, (arr.shape[0], 1))
print(arr.shape)
# (10, 1)
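
The same reshape can also be written with the array’s own reshape() method instead of the np.reshape() function. The sketch below compares the two forms; the variable names are only illustrative:

```python
import numpy as np

arr = np.arange(10)

col_function = np.reshape(arr, (arr.shape[0], 1))  # function form, as above
col_method = arr.reshape(arr.shape[0], 1)          # method form of the same call

print(col_method.shape)                          # (10, 1)
print(col_method.ndim)                           # 2
print(np.array_equal(col_function, col_method))  # True
```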

Reshape NumPy Array 2D to 1D

Let’s say we are collecting data from college indoor track meets for the women’s 200-meter dash.

During the first meet, we record the three best times: 23.09 seconds, 23.41 seconds, and 24.01 seconds.
During the second meet, we record the three best times: 22.55 seconds, 23.05 seconds, and 23.09 seconds.

We record this in a two-dimensional array. But once we begin analyzing the data we need the results to be in a single row. We do the following to reshape the matrix:

import numpy as np

track = np.array([[23.09, 23.41, 24.01], [22.55, 23.05, 23.09]])
track = np.reshape(track, (6,))

print(track)
# [23.09 23.41 24.01 22.55 23.05 23.09]

print(track.shape)
# (6,)

print(track.ndim)
# 1
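
For comparison, NumPy has two other common ways to flatten an array that this guide does not cover in depth: ravel() and flatten(). The sketch below uses the same track data and checks which result shares memory with the original:

```python
import numpy as np

track = np.array([[23.09, 23.41, 24.01], [22.55, 23.05, 23.09]])

raveled = track.ravel()      # a view when possible, like reshape
flattened = track.flatten()  # always an independent copy

print(raveled.shape, flattened.shape)       # (6,) (6,)
print(np.shares_memory(track, raveled))     # True
print(np.shares_memory(track, flattened))   # False
```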

NumPy reshape(arr, -1)

Now, we are more likely to have a situation where we have thousands of entries in our data.

Let’s say that we have been collecting data from the college indoor track meets for the 200-meter dash for women over the past 3 years.

It was easy to count the number of entries when we had only six, but now we have thousands of entries. Instead of doing the hard task of counting the number of entries, we can pass -1 in the newshape argument.

We can show this in the following example:

import numpy as np

track = np.array([[23.09, 23.41, 24.01], [22.55, 23.05, 23.09]])
track = np.reshape(track, -1)

print(track)
# [23.09 23.41 24.01 22.55 23.05 23.09]

print(track.shape)
# (6,)

print(track.ndim)
# 1

Using -1 for newshape can also be useful in multidimensional arrays. We will get back to it in the section on reshape(-1, m) and reshape(n, -1).
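
To see this on something larger than six entries, the sketch below builds a stand-in data set of random times. The data is made up purely for illustration (1000 hypothetical meets with 3 times each) and is not taken from real results:

```python
import numpy as np

# Hypothetical stand-in for a large data set: 1000 meets, 3 times each.
rng = np.random.default_rng(0)
track = rng.uniform(22.0, 25.0, size=(1000, 3))

# No need to count the entries: -1 asks NumPy to work out the length.
flat = np.reshape(track, -1)

print(flat.shape)               # (3000,)
print(flat.size == track.size)  # True
```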

Reshape NumPy Array 3D to 2D

Sometimes the data that we collect will be messy and before we start analyzing it, we need to tidy it up.

Let’s say we have three-dimensional data that looks like this:

data = [[[ 0, 1],
         [ 2, 3]],
        [[ 4, 5],
         [ 6, 7]],
        [[ 8, 9],
         [10, 11]],
        [[12, 13],
         [14, 15]]]

When we examine the data closer, we can see that it would make more sense to have it stored as a two-dimensional matrix.

The data holds 16 values grouped in “pairs”, so a matrix with two columns needs 16 / 2 = 8 rows. One way to do this is:

import numpy as np

data = np.array(data)
data = np.reshape(data, (8,2))
print(data)
"""
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]
 [12 13]
 [14 15]]
"""

NumPy reshape(-1, m) and reshape(n, -1)

The above method of reshaping a three-dimensional (3D) array to a two-dimensional (2D) array works if we don’t have a large number of entries.

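However, if we have thousands of entries, this can be tricky. In this case, we can use -1 for one dimension, and NumPy infers its size from the total number of elements and the remaining dimensions. This only works if the total divides evenly; otherwise NumPy raises a ValueError.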

Using the example above:

import numpy as np

data = [[[ 0, 1],
         [ 2, 3]],
        [[ 4, 5],
         [ 6, 7]],
        [[ 8, 9],
         [10, 11]],
        [[12, 13],
         [14, 15]]]

data = np.array(data)
data = np.reshape(data, (-1,2))
print(data)
"""
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]
 [12 13]
 [14 15]]
"""

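The example above infers the number of rows. The mirror case, reshape(n, -1), infers the number of columns instead. The sketch below shows both on the same 16 values, built here with arange() for brevity, along with two shapes that NumPy rejects:

```python
import numpy as np

data = np.arange(16).reshape(4, 2, 2)   # same values as the example above

rows_inferred = np.reshape(data, (-1, 2))   # 16 / 2 = 8 rows
cols_inferred = np.reshape(data, (4, -1))   # 16 / 4 = 4 columns

print(rows_inferred.shape)  # (8, 2)
print(cols_inferred.shape)  # (4, 4)
print(cols_inferred[0])     # [0 1 2 3]

# Only one dimension can be -1, and the rest must divide the size evenly.
failed = []
for bad_shape in [(-1, -1), (-1, 3)]:
    try:
        np.reshape(data, bad_shape)
    except ValueError:
        failed.append(bad_shape)
print(failed)  # [(-1, -1), (-1, 3)]
```
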
np.reshape vs. np.newaxis

When we want to perform operations on arrays, they need to be compatible.

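For element-wise operations, NumPy compares the shapes dimension by dimension, starting from the last one. Two dimensions are compatible if their sizes are equal or if one of them is 1. The arrays don’t have to have the same number of dimensions. If the sizes are not compatible, NumPy raises an error.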
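
A brief sketch of these compatibility rules, known in NumPy as broadcasting. The arrays are arbitrary examples filled with ones:

```python
import numpy as np

a = np.ones((3, 2))

same_shape = a + np.ones((3, 2))   # equal sizes in every dimension
row = a + np.array([10, 20])       # shape (2,) is treated as (1, 2)
column = a + np.ones((3, 1))       # a size of 1 is stretched to match

print(same_shape.shape, row.shape, column.shape)  # (3, 2) (3, 2) (3, 2)

# Sizes 2 and 3 in the last dimension are neither equal nor 1.
try:
    a + np.ones(3)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```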

When the arrays have a different number of dimensions, one way to add a dimension is to use the reshape() function.

Another way is to use the np.newaxis expression.

The advantage of np.newaxis over reshape() is that you do not have to spell out the sizes of the existing dimensions. Each np.newaxis in an index adds one dimension of size 1, so that one-dimensional arrays become two-dimensional, two-dimensional arrays become three-dimensional, and so on.

It works by indexing the array with np.newaxis at the position where the new dimension should appear. If we look at the original temperature array from earlier in the guide:

import numpy as np

temp = np.array([10, 50, 15, 59, 5, 42])

print(temp.shape)
# (6,)

temp = temp[np.newaxis, :]
print(temp.shape)
# (1, 6)

print(temp)
# [[10 50 15 59  5 42]]
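
The position of np.newaxis decides where the new dimension goes. The sketch below shows both placements on the temperature array and the equivalent reshape() calls; the names row and column are only illustrative:

```python
import numpy as np

temp = np.array([10, 50, 15, 59, 5, 42])

row = temp[np.newaxis, :]      # new axis in front: one row
column = temp[:, np.newaxis]   # new axis at the end: one column

print(row.shape)     # (1, 6)
print(column.shape)  # (6, 1)

# The same results via reshape.
print(np.array_equal(row, np.reshape(temp, (1, -1))))     # True
print(np.array_equal(column, np.reshape(temp, (-1, 1))))  # True

# np.newaxis is an alias for None.
print(np.newaxis is None)  # True
```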

Attribution

This article is contributed by Finxter user Milica Cvetkovic. Milica is also a writer on Medium — check out her Medium Profile.
