Selective Indexing

How to Conditionally Select Elements in a Numpy Array?

You have a Numpy array and want to select specific elements from it, but neither slicing nor plain integer indexing can express the selection you need. What can you do?

In this short tutorial, I show you how to select specific Numpy array elements via boolean matrices. You can even use conditions to select elements that fall in a certain range:

import numpy as np


A = np.array([[1,2,3],
             [4,5,6],
             [1,2,3]])

print(A[A > 3])
# [4 5 6]

Plus, you are going to learn three critical concepts of Python’s Numpy library: the arange() function, the reshape() function, and selective indexing.

Let’s start with a small code puzzle that demonstrates these three concepts:

import numpy as np


a = np.arange(9)
a = a.reshape((3,3))

print(a)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

b = np.array(
    [[ True, False, False],
     [ False, True, False],
     [ False, False, True]])
print(a[b])
# Flattened array with selected values from a
# [0 4 8]

1. The Numpy arange() Function

The numpy function np.arange([start,] stop[, step]) creates a new numpy array with evenly spaced numbers between start (inclusive) and stop (exclusive) with the given step size.

For example, np.arange(1, 6, 2) creates the numpy array [1, 3, 5]. You can also skip the start and step arguments (default values are start=0 and step=1). If you want to master the numpy arange function, read this introductory Numpy article.
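
To make the optional arguments concrete, here is a minimal sketch of the three ways to call np.arange(). The variable names are only illustrative:

```python
import numpy as np

# All three arguments: start=1 (inclusive), stop=6 (exclusive), step=2
odd = np.arange(1, 6, 2)
print(odd)  # [1 3 5]

# Only stop: start defaults to 0 and step defaults to 1
first_five = np.arange(5)
print(first_five)  # [0 1 2 3 4]

# start and stop: step defaults to 1
middle = np.arange(2, 5)
print(middle)  # [2 3 4]
```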

2. The Numpy reshape() Function

What do you do if you fall out of shape? You reshape.

The reshape(shape) function takes a shape tuple as an argument. In yesterday’s email, I showed you what the shape of a numpy array means exactly. Here is a small reminder: the shape is a tuple, and each tuple value gives the number of elements along one dimension.

The reshape(shape) function takes an existing numpy array and brings it into the new form specified by the shape argument. The new shape must hold the same total number of elements as the original. Think of it this way: the reshape function reads the data values of the original array in order (row by row by default) and fills an array of the new shape with them.
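
As a small sketch of this read-and-fill behavior, the following turns six values into a 2x3 matrix and then shows what happens when the element counts do not match. The names flat and grid are made up for this example:

```python
import numpy as np

flat = np.arange(6)          # [0 1 2 3 4 5], shape (6,)
grid = flat.reshape((2, 3))  # 2 rows, 3 columns, filled row by row

print(grid.shape)  # (2, 3)
print(grid)
# [[0 1 2]
#  [3 4 5]]

# The total number of elements must match: 6 values cannot fill a (4, 2) array
try:
    flat.reshape((4, 2))
except ValueError as err:
    print("reshape failed:", err)
```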

3. Selective Indexing in Numpy

Selective indexing: Instead of defining the slice to carve out a sequence of elements from an axis, you can select an arbitrary combination of elements from the numpy array.

How? Simply specify a boolean array with exactly the same shape. If the boolean value at position (i,j) is True, the element will be selected, otherwise not. As simple as that.

In the puzzle, the boolean matrix b with shape (3,3) is used as the index into a: the expression a[b] keeps exactly those elements of a where b is True.
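
You rarely have to write the boolean array by hand. As a rough sketch, a comparison on the array from the puzzle produces one for you, and two comparisons can be combined with & to select a range. The names mask and in_range are just illustrative:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# A comparison produces a boolean array with the same shape as a
mask = a > 3
print(mask)
# [[False False False]
#  [False  True  True]
#  [ True  True  True]]

print(a[mask])  # [4 5 6 7 8]

# Combine conditions element-wise with & to select a range;
# the parentheses around each comparison are required
in_range = (a >= 2) & (a <= 5)
print(a[in_range])  # [2 3 4 5]
```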

Beautiful, isn’t it?

Let me highlight an important detail. With a boolean array, you can select an arbitrary number of elements from each row and column. How is Numpy supposed to decide on the final shape? For example, you may select three elements from column 0 but only one from column 1, so what’s the shape here? No rectangular shape fits in general, which leaves only one solution: the result of this operation is a one-dimensional numpy array that lists the selected elements row by row.
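
Here is a short sketch of such an uneven selection, using a hand-written mask chosen only for illustration. The result comes back flat, in row-by-row order:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

# Column 0 contributes three elements, column 1 only one:
# no rectangular shape could hold this selection
b = np.array(
    [[True, True,  False],
     [True, False, False],
     [True, False, False]])

selected = a[b]
print(selected.shape)  # (4,)
print(selected)        # [0 1 3 6]
```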

That’s it for today. Congratulations if you could follow the numpy code explanations! In this case, you can already begin working as a Python freelancer. There are endless opportunities for Python freelancers in the data science space!

Join my Free Python Course to learn about how to earn money with Python.

