NumPy all() – A Simple Guide with Video



numpy.all(a, axis=None, out=None, keepdims=<no value>, *, where=<no value>)
a (array_like): Input array.
axis (None, int, or tuple of int): Optional. One axis or multiple axes along which the logical AND is performed. By default, the logical AND is computed over the flattened array. If this is a tuple of integers, the logical AND is computed along all of the specified axes.
out (ndarray): Optional. If specified, the result is written into this array rather than into a new one. To work, it must have the same shape as the expected output.
keepdims (bool): Optional. If True, the reduced axes are kept in the result as dimensions of size one rather than being dropped.
where (array_like of bool): Optional. Elements to include in the check. Positions where this is False are ignored.

Return Value: A new Boolean or array of Booleans, unless out is specified, in which case a reference to out is returned.
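As a rough illustration of the optional arguments, the sketch below exercises keepdims, where, and out on a small Boolean array. The array and the variable names (a, mask, result) are made up for this example, and the where argument assumes NumPy 1.20 or newer.

```python
import numpy as np

# Illustrative data (not from the article)
a = np.array([[True, False],
              [True, True]])

# keepdims=True keeps the reduced axis as a dimension of size one
kept = np.all(a, axis=1, keepdims=True)
print(kept.shape)  # (2, 1)

# where= limits the check to the selected elements;
# here the only False entry of a is masked out
mask = np.array([[True, False],
                 [True, True]])
masked = np.all(a, where=mask)
print(masked)  # True

# out= writes the result into an existing array of matching shape
result = np.empty(2, dtype=bool)
np.all(a, axis=0, out=result)
print(result)  # [ True False]
```

Passing keepdims=True is mostly useful when the result should broadcast against the original array afterwards.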

Documentation (source)

Simple Example Without Optional Arguments

The simplest and most common use is to check whether all values in an array evaluate to True.

>>> import numpy as np
>>> np.all([[1, 2, 3], [4, 5, 6]])
True

When at least one element evaluates to False (such as the integer 0), the whole operation evaluates to False:

>>> import numpy as np
>>> np.all(np.arange(100))
False

In the previous example, we used the np.arange() function to create a sequence of consecutive integers 0, 1, …, 99.
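To make the truthiness rule more concrete, here is a small sketch with a few edge cases. The variable names are only illustrative. Note that NaN and infinity count as True because they are not equal to zero, and that an empty array yields True.

```python
import numpy as np

# Every element is non-zero, so all() is True
nonzero = np.all([1, -2, 3.5])

# A single zero makes the whole result False
has_zero = np.all([1, 0, 3])

# NaN and infinity are not equal to zero, so they count as True
nan_case = np.all([np.nan, np.inf])

# An empty array has no False element, so the result is True
empty_case = np.all([])

print(nonzero, has_zero, nan_case, empty_case)
```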

NumPy all() Along Axes

Here’s an example of computing the function along the first axis (axis 0). This collapses the three rows and leaves one result per column:

>>> import numpy as np
>>> a = [[True, False], [True, True], [True, True]]
>>> np.all(a, axis=0)
array([ True, False])

Here’s an example of computing the function along the second axis (axis 1). This collapses the two columns and leaves one result per row:

>>> np.all(a, axis=1)
array([False,  True,  True])

And here’s an example where we specify a tuple of axes (0, 1), essentially collapsing both axes:

>>> np.all(a, axis=(0, 1))
False
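One way to keep the axis argument straight is to look at the shape of the result: the axes you pass disappear, the others remain. The sketch below checks this on the same 3×2 array and on a made-up three-dimensional array b (an assumption for illustration only).

```python
import numpy as np

a = np.array([[True, False],
              [True, True],
              [True, True]])   # shape (3, 2)

per_column = np.all(a, axis=0)       # axis 0 (rows) is collapsed
per_row = np.all(a, axis=1)          # axis 1 (columns) is collapsed
both = np.all(a, axis=(0, 1))        # both axes are collapsed
flat = np.all(a)                     # default: same as collapsing all axes

print(per_column.shape, per_row.shape)  # (2,) (3,)
print(both == flat)                     # True

# Hypothetical 3-D array: reducing axes 0 and 2 leaves only axis 1
b = np.ones((2, 3, 4), dtype=bool)
print(np.all(b, axis=(0, 2)).shape)     # (3,)
```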

Practical Example with Video

Python One-Liners | Data Science 9 | NumPy all() + axis

Have you ever bought a product recommended by Amazon’s algorithms? Chances are that you are guilty of having purchased many such products. The recommendation algorithms are often based on a technique called “association analysis”.

In this example, you’ll learn more about the basic idea of association analysis and how to dip your toe into the deep ocean of recommender systems, all in a single line of NumPy code.

Association analysis is based on historical (customer) data. For instance, you may have already read the recommendation “People who bought X also bought Y” on Amazon. This association of different products is a powerful marketing concept: it not only ties together related and complementary products, but also provides an element of “social proof”. The fact that other people have bought the product makes it feel safer for you to buy the product yourself. This is an excellent tool for marketers.

Let’s have a look at a practical example:

There are four people: Alice, Bob, Louis, and Larissa. Each person has bought different products (book, game, football, notebook, headphones). Say we know every product bought by all four people, except whether Louis has bought the notebook. What would you say: is Louis likely to buy the notebook?

Association analysis (or collaborative filtering) provides an answer to this problem. The underlying assumption is that if two people performed similar actions in the past (e.g. bought similar products), they are more likely to keep performing similar actions in the future. If you look closely at the customer profiles, you will quickly realize that Louis has a similar buying behavior to Alice. Both Louis and Alice have bought the game and the football but not the headphones and the book. For Alice, we also know that she bought the notebook. Thus, the recommender system will predict that Louis is likely to buy the notebook, too.
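The reasoning above can be sketched with np.all() as well. This is only a toy version of the idea, not a real recommender: the rows for Alice and Louis follow the description in the text, while the rows for Bob and Larissa are hypothetical values invented here, because their purchases are not spelled out.

```python
import numpy as np

# Columns: book, game, football, notebook, headphones
customers = ["Alice", "Bob", "Larissa"]
known = np.array([
    [0, 1, 1, 1, 0],   # Alice (as described in the text)
    [1, 0, 0, 0, 1],   # Bob (hypothetical values)
    [1, 1, 0, 1, 1],   # Larissa (hypothetical values)
])

# Louis: the notebook entry (index 3) is unknown, so we never read it
louis = np.array([0, 1, 1, 0, 0])
observed = np.array([0, 1, 2, 4])   # columns we know for Louis
notebook = 3

# A customer matches Louis if ALL observed purchases agree
matches = np.all(known[:, observed] == louis[observed], axis=1)
similar = [name for name, m in zip(customers, matches) if m]

# Fraction of matching customers who bought the notebook
predicted = known[matches, notebook].mean()

print(similar)     # ['Alice']
print(predicted)   # 1.0
```

Exact matching is a deliberately strict similarity measure. A practical system would use a softer notion of similarity, but the all-along-an-axis pattern is the same.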

The following code snippet simplifies this problem.

We consider the following problem: What fraction of customers bought both eBooks?

## Dependencies
import numpy as np

## Data: row is customer shopping basket
## row = [course 1, course 2, ebook 1, ebook 2]
## value 1 indicates that an item was bought.
basket = np.array([[0, 1, 1, 0],
                   [0, 0, 0, 1],
                   [1, 1, 0, 0],
                   [0, 1, 1, 1],
                   [1, 1, 1, 0],
                   [0, 1, 1, 0],
                   [1, 1, 0, 1],
                   [1, 1, 1, 1]])

## One-liner
copurchases = np.sum(np.all(basket[:, 2:], axis=1)) / basket.shape[0]

## Result
print(copurchases)

What is the output of this code snippet?

The basket data array consists of customer data with one row per customer and one column per product (see the Figure above). Say, the first two products with column indices 0 and 1 are online courses and the latter two products with column indices 2 and 3 are eBooks. The value “1” in cell (i,j) indicates that customer i has bought the product j.

The problem is to find the fraction of customers who bought both eBooks (columns 2 and 3). In other words, we need to count the number of customers who have a value “1” at both columns 2 and 3. Thus, we first carve out the relevant columns from the original array to get the following sub-array:

[[1 0]
 [0 1]
 [0 0]
 [1 1]
 [1 0]
 [1 0]
 [0 1]
 [1 1]]

The slicing operation ensures that only the third and the fourth column – but all rows – remain in the array.
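As a quick check of the slicing step, the following sketch extracts the two eBook columns and compares the slice with an explicit column selection. The name ebooks is introduced here only for illustration.

```python
import numpy as np

basket = np.array([[0, 1, 1, 0],
                   [0, 0, 0, 1],
                   [1, 1, 0, 0],
                   [0, 1, 1, 1],
                   [1, 1, 1, 0],
                   [0, 1, 1, 0],
                   [1, 1, 0, 1],
                   [1, 1, 1, 1]])

# ":" keeps all rows, "2:" keeps the columns from index 2 to the end
ebooks = basket[:, 2:]
print(ebooks.shape)  # (8, 2)

# Selecting columns 2 and 3 explicitly gives the same values
same_values = np.array_equal(ebooks, basket[:, [2, 3]])
print(same_values)   # True

# A basic slice is a view on the original data, not a copy
is_view = np.shares_memory(ebooks, basket)
print(is_view)       # True
```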

As you would intuitively guess, the NumPy all() function checks whether all values in a NumPy array evaluate to “True”. If this is the case, it returns “True”, otherwise it returns “False”. When used with the axis argument, the function performs this operation along the specified axis. Note that the axis argument is a recurring element for many different NumPy functions. Take your time to understand the axis argument properly: The specified axis is collapsed into a single value.

Thus, the result of applying the all() function on the sub-array is the following:

print(np.all(basket[:,2:], axis = 1))
# [False False False  True False False False  True]

In plain English: only the fourth and the last customers have bought both eBooks.

As we are interested in the fraction of customers, we sum over this Boolean array (side note: the Boolean value “True” is represented by an integer value of “1” and “False” by an integer value of “0”) and divide by the number of customers. The result is the fraction of customers who bought both eBooks (which is 0.25).
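Broken into separate steps, the one-liner might look like the sketch below. The intermediate names (both, count, fraction) are not from the original code. The last line shows an equivalent shortcut: the mean of a Boolean array is the fraction of True values.

```python
import numpy as np

basket = np.array([[0, 1, 1, 0],
                   [0, 0, 0, 1],
                   [1, 1, 0, 0],
                   [0, 1, 1, 1],
                   [1, 1, 1, 0],
                   [0, 1, 1, 0],
                   [1, 1, 0, 1],
                   [1, 1, 1, 1]])

# Step 1: one Boolean per customer, True if both eBooks were bought
both = np.all(basket[:, 2:], axis=1)

# Step 2: True counts as 1 and False as 0, so the sum is a count
count = np.sum(both)
print(count)  # 2

# Step 3: divide by the number of customers (rows)
fraction = count / basket.shape[0]
print(fraction)  # 0.25

# Equivalent shortcut: the mean of a Boolean array
fraction_mean = np.mean(both)
print(fraction_mean)  # 0.25
```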

This example is drawn from my book Python One-Liners:

Python One-Liners Book: Master the Single Line First!

Python programmers will improve their computer science skills with these useful one-liners.


Python One-Liners will teach you how to read and write “one-liners”: concise statements of useful functionality packed into a single line of code. You’ll learn how to systematically unpack and understand any line of Python code, and write eloquent, powerfully compressed Python like an expert.

The book’s five chapters cover (1) tips and tricks, (2) regular expressions, (3) machine learning, (4) core data science topics, and (5) useful algorithms.

Detailed explanations of one-liners introduce key computer science concepts and boost your coding and analytical skills. You’ll learn about advanced Python features such as list comprehension, slicing, lambda functions, regular expressions, map and reduce functions, and slice assignments.

You’ll also learn how to:

  • Leverage data structures to solve real-world problems, like using Boolean indexing to find cities with above-average pollution
  • Use NumPy basics such as array, shape, axis, type, broadcasting, advanced indexing, slicing, sorting, searching, aggregating, and statistics
  • Calculate basic statistics of multidimensional data arrays and the K-Means algorithms for unsupervised learning
  • Create more advanced regular expressions using grouping and named groups, negative lookaheads, escaped characters, whitespaces, character sets (and negative character sets), and greedy/nongreedy operators
  • Understand a wide range of computer science topics, including anagrams, palindromes, supersets, permutations, factorials, prime numbers, Fibonacci numbers, obfuscation, searching, and algorithmic sorting

By the end of the book, you’ll know how to write Python at its most refined, and create concise, beautiful pieces of “Python art” in merely a single line.

