NumPy Diff Simply Explained [Bonus Video]

The NumPy diff function np.diff() calculates the difference between each pair of consecutive values in a NumPy array. For example, calling np.diff() on the NumPy array [1 2 4] results in the difference array [1 2], because 2 - 1 = 1 and 4 - 2 = 2.

Here is a detailed example:

import numpy as np

# Fibonacci sequence: the first 9 numbers
fibs = np.array([0, 1, 1, 2, 3, 5, 8, 13, 21])


diff_fibs = np.diff(fibs)
print(diff_fibs)
# [1 0 1 1 2 3 5 8]

This code snippet shows the simplest form of the np.diff() function: how to use it on a one-dimensional NumPy array. Each output value is an element minus its predecessor, so output position i holds fibs[i+1] - fibs[i]. Hence, an array with n elements results in a diff array with n-1 elements.
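To make the n-1 rule concrete, here is a minimal sketch that reproduces the one-dimensional result with plain slicing. The variable name manual is only for illustration, and the slicing approach is an equivalent way to think about the result rather than a claim about how NumPy implements it internally.

```python
import numpy as np

fibs = np.array([0, 1, 1, 2, 3, 5, 8, 13, 21])

# Subtract every element from its successor by using two shifted slices
manual = fibs[1:] - fibs[:-1]

print(manual)
# [1 0 1 1 2 3 5 8]

print(np.array_equal(manual, np.diff(fibs)))
# True

print(len(fibs), len(manual))
# 9 8
```

Both slices have n-1 elements, which is why the difference array is always one element shorter than the input.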

Executing the NumPy Diff Method Multiple Times

We can also apply the NumPy diff function repeatedly in a single call by setting the argument n:

import numpy as np

a = np.array([2, 4, 7, 4, 1, 8, 11, 12])
print(np.diff(a, n=1))
# [ 2  3 -3 -3  7  3  1]

print(np.diff(a, n=2))
# [ 1 -6  0 10 -4 -2]

print(np.diff(a, n=3))
# [ -7   6  10 -14   2]

print(np.diff(a, n=4))
# [ 13   4 -24  16]

print(np.diff(a, n=5))
# [ -9 -28  40]

print(np.diff(a, n=6))
# [-19  68]

print(np.diff(a, n=7))
# [87]

print(np.diff(a, n=8))
# []

By defining the argument n, we can execute the diff function multiple times on the respective output of the last execution. Hence, the call np.diff(x, n=2) results in the same output as np.diff(np.diff(x)).
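The equivalence can be checked directly. The following sketch uses a small helper, repeated_diff, which is a hypothetical name introduced here for illustration and not part of NumPy. It calls np.diff() in a loop and compares the result with the n argument.

```python
import numpy as np

a = np.array([2, 4, 7, 4, 1, 8, 11, 12])


def repeated_diff(x, n):
    """Hypothetical helper: apply np.diff() n times in a row."""
    for _ in range(n):
        x = np.diff(x)
    return x


# Compare the loop with the built-in n argument for every n from 1 to 8
for n in range(1, 9):
    assert np.array_equal(np.diff(a, n=n), repeated_diff(a, n))

print(np.array_equal(np.diff(a, n=2), np.diff(np.diff(a))))
# True

# Each pass removes one element, so 8 passes over 8 values leave nothing
print(np.diff(a, n=8).size)
# 0
```

Because each pass shortens the array by one element, an array with k elements becomes empty once n reaches k.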

NumPy Diff with Two Axes

But what happens if you have a two-dimensional NumPy array? In other words, how does the diff function work with multiple axes?

Here is an example of how you can use the diff function to calculate the differences between neighboring columns within each row, that is, along axis=1:

import numpy as np

a = np.array([[0, 1, 1],
              [2, 3, 5],
              [8, 13, 21]])


diffs = np.diff(a, axis=1)
print(diffs)
"""
[[1 0]
 [1 2]
 [5 8]]
"""

You can see that each row with three columns is collapsed into a row with only two columns (the differences).
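For comparison, here is a short sketch that runs the same array along both axes. The names row_wise and col_wise are chosen for illustration only. With axis=0, the differences are taken between neighboring rows instead, so the number of rows shrinks by one rather than the number of columns.

```python
import numpy as np

a = np.array([[0, 1, 1],
              [2, 3, 5],
              [8, 13, 21]])

# axis=1: differences between neighboring columns, one column fewer
row_wise = np.diff(a, axis=1)
print(row_wise.shape)
# (3, 2)

# axis=0: differences between neighboring rows, one row fewer
col_wise = np.diff(a, axis=0)
print(col_wise)
"""
[[ 2  2  4]
 [ 6 10 16]]
"""

# Without an axis argument, np.diff() works along the last axis
print(np.array_equal(np.diff(a), row_wise))
# True
```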

Let’s make it even more complex and combine the axis argument with the n argument for multiple diff executions in a single function call:

import numpy as np

a = np.array([[0, 1, 1],
              [2, 3, 5],
              [8, 13, 21]])


diffs = np.diff(a, n=2, axis=1)
print(diffs)
"""
[[-1]
 [ 1]
 [ 3]]
"""

In this example, we use the axis argument axis=1, which means that we calculate the differences between neighboring columns within each row. For example, a single pass over the first row [0 1 1] results in the diff array [1 0].

When defining the parameter n, the diff function is applied n times to the output of the previous diff function execution. Thus, the first row undergoes the following transformations:

[0 1 1] diff--> [1 0] diff--> [-1]
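The same chain can be traced in code. This is a small sketch in which the names first_row, step1, and step2 are illustrative. It applies np.diff() twice to the first row by hand and compares the result with the first row of the combined call.

```python
import numpy as np

a = np.array([[0, 1, 1],
              [2, 3, 5],
              [8, 13, 21]])

first_row = a[0]            # [0 1 1]
step1 = np.diff(first_row)  # first pass
step2 = np.diff(step1)      # second pass

print(step1)
# [1 0]

print(step2)
# [-1]

# The combined call yields the same value for the first row
print(np.diff(a, n=2, axis=1)[0])
# [-1]
```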

Where to Go From Here?

Having a proficient Python education is critical for your success as a developer. You cannot hope to master data science if you do not even know the most basic Python and computer science concepts.

To this end, I have created a free Python email course (+ Bonus Cheat Sheet series). Subscribe if you need to refresh your basic Python knowledge! It’s fun!

If you’re already proficient in Python, study the NumPy library in-depth and kickstart your data science career with our LeanPub bestselling book “Coffee Break NumPy”!
