NumPy Matrix Multiplication — np.matmul() and @ [Ultimate Guide]

NumPy’s np.matmul() and the @ operator perform matrix multiplication. For 2D arrays, they compute the ordinary matrix product. For arrays with more than two dimensions, they treat the inputs as stacks of matrices held in the last two axes and multiply them pairwise, broadcasting the leading axes. (A sum product over the last axis of the first array and the second-to-last axis of the second array is how np.dot() handles higher dimensions.)

Have you ever tried to multiply two NumPy arrays together and got a result you didn’t expect? NumPy’s multiplication functions can be confusing. In this article, we’ll explain everything you need to know about matrix multiplication in NumPy.

To perform matrix multiplication between two NumPy arrays, there are three methods. All of them have simple syntax. Let’s quickly go through them in order from best to worst. First, we have the @ operator

# Python >= 3.5
>>> import numpy as np

# 2x2 arrays where each value is 1.0
>>> A = np.ones((2, 2))
>>> B = np.ones((2, 2))

>>> A @ B
array([[2., 2.],
       [2., 2.]])

Next, np.matmul() 

>>> np.matmul(A, B)
array([[2., 2.],
       [2., 2.]])

And finally np.dot()

>>> np.dot(A, B)
array([[2., 2.],
       [2., 2.]])

Why are there so many choices? And which should you choose? Before we answer those questions, let’s have a refresher on matrix multiplication and NumPy’s default behavior.

np.matmul() vs np.dot() vs @ Matrix Multiplication Operators

What is Matrix Multiplication?

If you don’t know what matrix multiplication is, or why it’s useful, check out this short article.

Matrices and arrays are the basis of almost every area of research. This includes machine learning, computer vision and neuroscience to name a few. If you are working with numbers, you will use matrices, arrays and matrix multiplication at some point.

Now that you know why it’s so important, let’s get to the code.

numpy.array — Default Behavior

The default behavior for any mathematical function in NumPy is element-wise operations. This is one advantage NumPy arrays have over standard Python lists.

Let’s say we have a Python list and want to add 5 to every element. To do this we’d have to either write a for loop or a list comprehension

# For loop - complicated and slow
>>> a = [1, 1, 1, 1]
>>> b = []
>>> for x in a:
...     b.append(x + 5)
...
>>> b
[6, 6, 6, 6]

# List comprehension - nicer but still slow
>>> a = [1, 1, 1, 1]
>>> b = [x + 5 for x in a]
>>> b
[6, 6, 6, 6]

Both of these are slow and cumbersome. 

Instead, if A is a NumPy array it’s much simpler 

>>> A = np.array([1, 1, 1, 1])
>>> B = A + 5
>>> B
array([6, 6, 6, 6])

And much much much faster

# Using a list of length 100,000 for demonstration purposes
In [1]: a = list(range(100000))

In [2]: b = []

In [3]: %timeit for x in a: b.append(x + 5)
28.5 ms ± 5.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: b = []

In [5]: %timeit b = [x+5 for x in a]
8.18 ms ± 235 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: A = np.array(a)

In [7]: %timeit B = A + 5
81.2 µs ± 2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In this run, using arrays is about 100x faster than the list comprehension and about 350x faster than the for loop.
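If you are not working in IPython, a rough equivalent of the comparison can be sketched with the standard timeit module. The helper names with_loop, with_comprehension and with_numpy are ours, purely for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np

a = list(range(100_000))
A = np.array(a)


def with_loop():
    # Build a fresh list on every call with an explicit for loop
    b = []
    for x in a:
        b.append(x + 5)
    return b


def with_comprehension():
    # Same result using a list comprehension
    return [x + 5 for x in a]


def with_numpy():
    # Element-wise addition on the whole array at once
    return A + 5


# All three approaches produce the same values
assert with_loop() == with_comprehension() == with_numpy().tolist()

for func in (with_loop, with_comprehension, with_numpy):
    runs = 20
    seconds = timeit.timeit(func, number=runs) / runs
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```

Each function builds its result from scratch, so the three timings measure the same amount of work.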

If we want to multiply every element by 5 we do the same

>>> C = A * 5
>>> C
array([5, 5, 5, 5])

The same applies for subtraction and division. 

Every mathematical operation acts element-wise by default. So if you multiply two NumPy arrays together, NumPy assumes you want to do element-wise multiplication.

>>> np.ones((2, 2)) * np.array([[1, 2], [3, 4]])
array([[1., 2.],
       [3., 4.]])

A core feature of matrix multiplication is that a matrix with dimension (m x n) can be multiplied by another with dimension (n x p) for some integers m, n and p. If you try this with *, you get a ValueError

# This would work for matrix multiplication
>>> np.ones((3, 2)) * np.ones((2, 4))
ValueError: operands could not be broadcast together with shapes (3,2) (2,4)

This happens because NumPy is trying to do element-wise multiplication, not matrix multiplication. It can’t do an element-wise operation here because the shapes (3, 2) and (2, 4) are different and cannot be broadcast to a common shape.

Element-wise operations are an incredibly useful feature. You will make use of them many times in your career. But you will also want to do matrix multiplication at some point.
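To see that it is the shapes that matter to *, here is a small sketch (the variable names are ours). Two arrays with the same number of elements still fail if their shapes are incompatible, some unequal shapes do work thanks to broadcasting, and @ accepts the (m x n) by (n x p) case.

```python
import numpy as np

a = np.ones((3, 2))
b = np.ones((2, 3))  # 6 elements, like a, but a different shape

# Element-wise multiplication fails: (3, 2) and (2, 3) cannot be broadcast
try:
    a * b
except ValueError as err:
    print("element-wise failed:", err)

# Broadcasting does allow some unequal shapes, e.g. (3, 2) with (2,)
row = np.array([10.0, 20.0])
scaled = a * row
print(scaled.shape)  # (3, 2)

# Matrix multiplication works: (3, 2) @ (2, 3) gives (3, 3)
product = a @ b
print(product.shape)  # (3, 3)
```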

Perhaps the answer lies in using the numpy.matrix class?

numpy.matrix

There is a subclass of NumPy array called numpy.matrix. This operates similarly to matrices we know from the mathematical world. If you create some numpy.matrix instances and call *, you will perform matrix multiplication

# Element wise multiplication because they are arrays
>>> np.array([[1, 1], [1, 1]]) * np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])

# Matrix multiplication because they are matrices
>>> np.matrix([[1, 1], [1, 1]]) * np.matrix([[1, 2], [3, 4]])
matrix([[4, 6],
        [4, 6]])

But this causes some issues. 

For example, if you have 20 matrices in your code and 20 arrays, it will get very confusing very quickly. You may multiply two together expecting one result but get another, because the * operator is overloaded. This results in code that is hard to read and full of bugs.

We feel that this is one reason why the NumPy docs (v1.17) now say:

It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future.

You may see this recommended in other places around the internet. But, as NumPy no longer recommends it, we will not discuss it further.

Now let’s look at some other methods.

Other Methods of Matrix Multiplication

There are 2 methods of matrix multiplication that involve function calls.

Let’s start with the one we don’t recommend

numpy.dot

As the name suggests, this computes the dot product of two vectors. It takes two arguments – the arrays you would like to perform the dot product on. There is a third optional argument that is used to enhance performance which we will not cover.

>>> vec1 = np.array([1, 2, 3])
>>> vec2 = np.array([3, 2, 1])

# Dot product is (1*3) + (2*2) + (3*1) = 3 + 4 + 3 = 10
>>> np.dot(vec1, vec2)
10

If you use this function with a pair of 2D arrays, it does matrix multiplication.

>>> three_by_two = np.ones((3, 2))
>>> two_by_four = np.ones((2, 4))
>>> output = np.dot(three_by_two, two_by_four)

# We expect shape (3,2) x (2,4) = shape (3,4)
>>> output.shape
(3, 4)

# Output as expected from matrix multiplication
>>> output
array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]])

This method works but is not recommended by us or NumPy: the NumPy docs say that matmul or @ is preferred for 2D arrays. One reason is that, in maths, the ‘dot product’ has a specific meaning: it takes two vectors and returns a single number. That is very different from matrix multiplication, so it can be confusing to mathematicians to see np.dot() returning a matrix product.

There are times when you can, and should, use this function (e.g. if you want to calculate the dot product) but, for brevity, we refer you to the official docs.
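The two functions agree for 1D and 2D inputs, but they are not interchangeable in general. The sketch below (variable names are ours) shows two differences we are confident about: how they treat arrays with more than two dimensions, and whether they accept scalars.

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# np.dot sums over the last axis of a and the second-to-last axis of b,
# combining every matrix in a with every matrix in b
dot_shape = np.dot(a, b).shape
print(dot_shape)  # (2, 3, 2, 5)

# np.matmul treats a and b as stacks of matrices and multiplies them pairwise
matmul_shape = np.matmul(a, b).shape
print(matmul_shape)  # (2, 3, 5)

# np.dot accepts a scalar and simply scales the array
scaled = np.dot(3, np.ones(2))
print(scaled)  # [3. 3.]

# np.matmul does not accept scalars
try:
    np.matmul(3, np.ones(2))
except (ValueError, TypeError) as err:
    print("matmul rejects scalars:", err)
```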

So you should not use this function for matrix multiplication. What about the other one?

numpy.matmul

np.matmul() Example 2D Matrix Multiplication

This is the NumPy MATrix MULtiplication function. Calling it with two matrices as the first and second arguments will return the matrix product. 

>>> three_by_two = np.ones((3, 2))
>>> two_by_four = np.ones((2, 4))
>>> output = np.matmul(three_by_two, two_by_four)

# Shape as expected from matrix multiplication
>>> output.shape
(3, 4)

# Output as expected from matrix multiplication
>>> output
array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]])

The function name is clear and it is quite easy to read. This is a vast improvement over np.dot(). There are even some advanced features you can use with this function. But for 90% of cases, this should be all you need. Check the docs for more info.

So is this the method we should use whenever we want to do NumPy matrix multiplication? No. We’ve saved the best till last.
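As a taste of those advanced features, here is a minimal sketch of two behaviours described in the np.matmul() docs: broadcasting over stacks of matrices and the handling of 1D arrays. The array names are made up for the example.

```python
import numpy as np

# A stack of ten 3x2 matrices multiplied by a single 2x4 matrix:
# the single matrix is broadcast across the stack
stack = np.ones((10, 3, 2))
single = np.ones((2, 4))
out = np.matmul(stack, single)
print(out.shape)  # (10, 3, 4)

# A 1D array on the right is treated as a column vector,
# and the extra dimension is removed from the result
M = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
v = np.array([1, 0, 1])
Mv = np.matmul(M, v)
print(Mv)  # [2 8]

# A 1D array on the left is treated as a row vector
w = np.array([1, 1])
wM = np.matmul(w, M)
print(wM)  # [3 5 7]
```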

Python @ Operator

The @ operator was introduced to Python’s core syntax from 3.5 onwards thanks to PEP 465. Its only goal is to solve the problem of matrix multiplication. It even comes with a nice mnemonic – @ is * for mATrices. 

One of the main reasons for introducing this was because there was no consensus in the community for how to properly write matrix multiplication. The asterisk * symbol was competing for two operations:

  • element wise multiplication, and
  • matrix multiplication.

The solutions were function calls, which worked but aren’t very readable and are hard for beginners to understand. Plus, research suggested that matrix multiplication was more common than // (floor) division, yet floor division has its own syntax.

It is unusual that @ was added to the core Python language when it’s only used with certain libraries. Fortunately, the only other time we use @ is for decorator functions. So you are unlikely to get confused. 

It works exactly as you expect matrix multiplication to, so we don’t feel much explanation is necessary.

# Python >= 3.5
# 2x2 arrays where each value is 1.0
>>> A = np.ones((2, 2))
>>> B = np.ones((2, 2))

>>> A @ B
array([[2., 2.],
       [2., 2.]])

One thing to note is that matrix multiplication using @ is left associative, which may differ from how you are used to reading a product in maths.

If you are used to seeing

AZx

Where A and Z are matrices and x is a vector, you may expect the operation to be performed in a right associative manner, i.e.

A(Zx)

So you perform Zx first and then A(Zx). But almost all of Python’s mathematical operators are left associative (exponentiation, **, is the exception).

# a + b + c is evaluated as (a + b) + c
# a / b / c is evaluated as (a / b) / c
# a - b - c is evaluated as (a - b) - c

A numerical example

# Right associative
>>> 8 / (4 / 2)
4.0

# Left associative
>>> (8 / 4) / 2
1.0

# Python is left associative by default
>>> 8 / 4 / 2
1.0

There was no consensus as to which was better. Since almost everything else in Python is left associative, the community decided to make @ left associative too.
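Because matrix multiplication itself is associative, both groupings give the same answer up to floating point rounding. What changes is the amount of work: (A @ Z) @ x first computes a full matrix-matrix product, while A @ (Z @ x) only needs two matrix-vector products. The sketch below uses random matrices of an arbitrary size we picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 200))
Z = rng.random((200, 200))
x = rng.random(200)

# Python evaluates this left to right, i.e. as (A @ Z) @ x
left = A @ Z @ x

# Explicit parentheses force the right associative order
right = A @ (Z @ x)

# Same result up to rounding, but the second form does far less work
print(np.allclose(left, right))  # True
```

So if a chain of products ends in a vector, adding parentheses around the right-hand part can be worth it.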

So should you use @ whenever you want to do NumPy matrix multiplication?

Which Should You Choose?

There is some debate in the community as to which method is best. However, we believe that you should always use the @ operator. It was introduced to the language to solve the exact problem of matrix multiplication. There are many reasons detailed in PEP 465 as to why @ is the best choice.

The main reason we favour it is that it’s much easier to read when multiplying two or more matrices together. Let’s say we want to calculate ABCD. We have two options

# Very hard to read
>>> np.matmul(np.matmul(np.matmul(A, B), C), D)

# vs

# Very easy to read
>>> A @ B @ C @ D

This short example demonstrates the power of the @ operator. The mathematical symbols translate directly to your code, there are fewer characters to type, and it’s much easier to read.

Unfortunately, if you use a version of Python older than 3.5, you’ll have to stick with np.matmul().
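To check that the two spellings of ABCD really agree, here is a small runnable sketch. The shapes are arbitrary choices of ours. It also shows one way to multiply a whole list of matrices using functools.reduce with operator.matmul, which is the function form of @.

```python
from functools import reduce
import operator

import numpy as np

# Four compatible matrices: (2x3)(3x4)(4x5)(5x2) gives a 2x2 result
matrices = [np.ones((2, 3)), np.ones((3, 4)), np.ones((4, 5)), np.ones((5, 2))]
A, B, C, D = matrices

nested = np.matmul(np.matmul(np.matmul(A, B), C), D)
chained = A @ B @ C @ D

# reduce applies @ from left to right across the list
reduced = reduce(operator.matmul, matrices)

print(chained)
# [[60. 60.]
#  [60. 60.]]
print(np.array_equal(nested, chained))   # True
print(np.array_equal(reduced, chained))  # True
```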

Summary

You now know how to multiply two matrices together and why this is so important for your Python journey.

If in doubt, remember that @ is for mATrix multiplication.

Where To Go From Here?

There are several other NumPy functions that deal with matrix, array and tensor multiplication. If you are doing Machine Learning, you’ll need to learn the difference between them all.

A good place to get a thorough NumPy education is the comprehensive Finxter NumPy tutorial on this blog and our new book Coffee Break NumPy.

Check out the following functions for more info:

Do you want to become a NumPy master? Check out our interactive puzzle book Coffee Break NumPy and boost your data science skills!

Daily Data Science Puzzle

[python]
import numpy as np

# graphics data
a = [[1, 1],
     [1, 0]]
a = np.array(a)

# stretch vectors
b = [[2, 0],
     [0, 2]]
b = np.array(b)
c = a @ b
d = np.matmul(a, b)
print((c == d)[0, 0])
[/python]

What is the output of this puzzle?

NumPy is a popular Python library for data science focusing on arrays, vectors, and matrices.

This puzzle shows an important application domain of matrix multiplication: Computer Graphics.

We create two matrices a and b. The first matrix a is the data matrix (e.g. consisting of two column vectors (1,1) and (1,0)). The second matrix b is the transformation matrix that transforms the input data. In our setting, the transformation matrix simply stretches the column vectors.

More precisely, the two column vectors (1,1) and (1,0) are stretched by factor 2 to (2,2) and (2,0). The resulting matrix is therefore [[2,2],[2,0]]. The puzzle indexes the first row and first column with [0,0].

We use matrix multiplication to apply this transformation. NumPy offers two ways to do matrix multiplication here: the matmul function and the @ operator.

Comparing two equal-sized NumPy arrays results in a new array with boolean values. As both matrices c and d contain the same data, the result is a matrix with only True values.
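If you want to check the reasoning yourself, the following sketch recomputes the puzzle and shows two common ways to test whether two arrays are equal as a whole. The name comparison is ours.

```python
import numpy as np

a = np.array([[1, 1],
              [1, 0]])
b = np.array([[2, 0],
              [0, 2]])

c = a @ b
d = np.matmul(a, b)
print(c)
# [[2 2]
#  [2 0]]

# Element-wise comparison gives a boolean array of the same shape
comparison = c == d
print(comparison[0, 0])      # True

# Collapse the comparison to a single boolean
print(comparison.all())      # True
print(np.array_equal(c, d))  # True
```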


Solution

True