In Python, the `numpy.diff()` function **calculates the n-th discrete difference between adjacent values in an array along a given axis**. For higher-order differences, `numpy.diff()` is applied recursively to the output of the previous step.

The arguments of `numpy.diff()` are summarized in the argument table in the Syntax and Arguments section.

If that sounds good to you, please continue reading, and you will fully understand the `numpy.diff()` function through Python code snippets and vivid visualization.

This tutorial is about the `numpy.diff()` function.

- Concretely, I will introduce its syntax and arguments.
- Then, you will learn some basic examples of this function.
- Finally, I will address three top questions about `numpy.diff()`, including `np.diff` `prepend`, `np.diff` vs. `np.gradient`, and `np.diff` with `datetime`.

You can find all the code from this tutorial here.

## Syntax and Arguments

Here is the syntax of `numpy.diff()`:

    numpy.diff(a, n=1, axis=-1, prepend=<no value>, append=<no value>)

### Argument Table

Here is the argument table:

| Argument | Accepts | Description |
| --- | --- | --- |
| `a` | `array_like` | Input array. |
| `n` | `int`, optional | The number of times values are differenced. If zero, the input is returned as-is. |
| `axis` | `int`, optional | The axis along which the difference is taken. The default is the last axis. |
| `prepend`, `append` | `array_like` or scalar, optional | Values to prepend or append to `a` along `axis` before performing the difference. Scalar values are expanded to arrays with length 1 in the direction of `axis` and the shape of the input array along all other axes. Otherwise, the dimension and shape must match `a` except along `axis`. |
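To make the table more concrete, here is a minimal sketch of how `n` and `axis` affect the shape of the result. The array `a` and its values are made up purely for illustration; the point is that each round of differencing shortens the chosen axis by one element.

```python
import numpy as np

# Hypothetical 3x4 input, chosen only for illustration
a = np.arange(12).reshape(3, 4) ** 2

# Defaults are n=1 and axis=-1 (the last axis), so each row loses one element
shape_default = np.diff(a).shape          # (3, 3)

# Differencing along axis 0 shortens the first axis instead
shape_axis0 = np.diff(a, axis=0).shape    # (2, 4)

# n=2 shortens the chosen axis by two elements
shape_n2 = np.diff(a, n=2, axis=1).shape  # (3, 2)

# n=0 returns the input as-is
unchanged = np.diff(a, n=0)

print(shape_default, shape_axis0, shape_n2)
print(np.array_equal(unchanged, a))
```

Checking `.shape` like this is a quick way to confirm that you are differencing along the axis you intended.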

## Basic Examples

As mentioned before, for higher-order differences, `numpy.diff()` is applied recursively to the output of the previous step.
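A minimal sketch to check this recursive behaviour: the first difference is just "each element minus the one before it", and asking for `n=2` should give the same result as calling `np.diff()` twice. The variable names here are my own and only serve the illustration.

```python
import numpy as np

one_dim = np.array([1, 2, 4, 9, 15, 20])

# The first difference written out with slicing: a[i+1] - a[i]
manual_first = one_dim[1:] - one_dim[:-1]
print(manual_first)          # [1 2 5 6 5]
print(np.diff(one_dim))      # [1 2 5 6 5]

# Apply the first difference twice by hand ...
manual_second = np.diff(np.diff(one_dim))

# ... and compare it with asking for n=2 directly
direct_second = np.diff(one_dim, n=2)

print(manual_second)         # [ 1  3  1 -1]
print(np.array_equal(manual_second, direct_second))  # True
```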

This function may sound abstract, but I’ve been there before. Let me help you understand this step by step!

### “0” difference and 1st difference in a one-dimensional array

Here is a code example showing the “0” difference and the 1st difference of a one-dimensional array:

    import numpy as np

    # “0” difference and 1st difference in one-dimensional array example
    '''
    The first difference is given by out[i] = a[i+1] - a[i] along the given axis,
    higher differences are calculated by using diff recursively.
    '''
    one_dim = np.array([1, 2, 4, 7, 12])
    it_self = np.diff(one_dim, n=0)
    one_diff = np.diff(one_dim, n=1)

    print(f'One dimensional array: {one_dim}')
    print(f'"0" difference: {it_self}')
    print(f'1st difference: {one_diff}')

Output:

    One dimensional array: [ 1  2  4  7 12]
    "0" difference: [ 1  2  4  7 12]
    1st difference: [1 2 3 5]

### 2nd difference and 3rd difference in a one-dimensional array

Here is a code example showing the 2nd difference and the 3rd difference of a one-dimensional array:

    import numpy as np

    # 2nd difference and 3rd difference example
    '''
    The first difference is given by out[i] = a[i+1] - a[i] along the given axis,
    higher differences are calculated by using diff recursively.
    '''
    one_dim = np.array([1, 2, 4, 9, 15, 20])
    one_diff = np.diff(one_dim, n=1)
    two_diff = np.diff(one_dim, n=2)
    three_diff = np.diff(one_dim, n=3)

    print(f'One dimensional array: {one_dim}')
    print(f'1st difference: {one_diff}')
    print(f'2nd difference: {two_diff}')
    print(f'3rd difference: {three_diff}')

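Output:

    One dimensional array: [ 1  2  4  9 15 20]
    1st difference: [1 2 5 6 5]
    2nd difference: [ 1  3  1 -1]
    3rd difference: [ 2 -2 -2]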

### 2nd difference in a two-dimensional array with axis = 0

Here is an example of the 2nd difference in a two-dimensional array with `axis = 0`:

    import numpy as np

    # 2nd difference in two-dimensional array example - axis=0
    '''
    The first difference is given by out[i] = a[i+1] - a[i] along the given axis,
    higher differences are calculated by using diff recursively.
    '''
    two_dim = np.array([[1, 2, 4, 9, 15, 20],
                        [4, 2, 1, 0, 24, 8],
                        [3, 7, 5, 13, 17, 0]])
    one_diff = np.diff(two_dim, n=1, axis=0)
    two_diff = np.diff(two_dim, n=2, axis=0)

    print(f'Two dimensional array: {two_dim}')
    print('-'*85)
    print(f'1st difference: {one_diff}')
    print('-'*85)
    print(f'2nd difference: {two_diff}')

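Output (separator lines omitted):

    Two dimensional array: [[ 1  2  4  9 15 20]
     [ 4  2  1  0 24  8]
     [ 3  7  5 13 17  0]]
    1st difference: [[  3   0  -3  -9   9 -12]
     [ -1   5   4  13  -7  -8]]
    2nd difference: [[ -4   5   7  22 -16   4]]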

### 2nd difference in a two-dimensional array with axis = 1

Here is an example of the 2nd difference in a two-dimensional array with `axis = 1`:

    import numpy as np

    # 2nd difference in two-dimensional array example - axis=1
    '''
    The first difference is given by out[i] = a[i+1] - a[i] along the given axis,
    higher differences are calculated by using diff recursively.
    '''
    two_dim = np.array([[1, 2, 4, 9, 15, 20],
                        [4, 2, 1, 0, 24, 8],
                        [3, 7, 5, 13, 17, 0]])
    one_diff = np.diff(two_dim, n=1, axis=1)
    two_diff = np.diff(two_dim, n=2, axis=1)

    print(f'Two dimensional array: {two_dim}')
    print('-'*85)
    print(f'1st difference: {one_diff}')
    print('-'*85)
    print(f'2nd difference: {two_diff}')

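Output (separator lines omitted):

    Two dimensional array: [[ 1  2  4  9 15 20]
     [ 4  2  1  0 24  8]
     [ 3  7  5 13 17  0]]
    1st difference: [[  1   2   5   6   5]
     [ -2  -1  -1  24 -16]
     [  4  -2   8   4 -17]]
    2nd difference: [[  1   3   1  -1]
     [  1   0  25 -40]
     [ -6  10  -4 -21]]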

Now, I hope you understand how `numpy.diff()` calculates higher-order differences and how the `axis` argument controls the direction of the calculation.

Let’s now dive into top questions regarding this function and gain further understanding!

## np.diff() prepend

First, many people find the `prepend` and `append` arguments of this function hard to understand.

Since these two arguments work pretty similarly, I will help you comprehend the `prepend` argument in this part and leave you to figure out the `append` argument yourself 🙂

The argument table in the Syntax and Arguments section describes the `prepend` argument.

From that description, we can see that there are two ways, the array way and the scalar way, to prepend values to `a` along the axis before performing the difference calculation.

Here is the array way:

    import numpy as np

    # prepend with array - axis=0
    two_dim = np.array([[1, 2, 4, 9, 15, 20],
                        [4, 2, 1, 0, 24, 8],
                        [3, 7, 5, 13, 17, 0]])
    one_diff = np.diff(two_dim, n=1, axis=0, prepend=[[1] * two_dim.shape[1]])
    two_diff = np.diff(two_dim, n=2, axis=0, prepend=[[1] * two_dim.shape[1]])
    # one_diff = np.diff(two_dim, n=1, axis=0, prepend=[[1, 1, 1, 1, 1, 1]])
    # two_diff = np.diff(two_dim, n=2, axis=0, prepend=[[1, 1, 1, 1, 1, 1]])

    print(f'Two dimensional array: {two_dim}')
    print('-'*85)
    print(f'1st difference: {one_diff}')
    print('-'*85)
    print(f'2nd difference: {two_diff}')

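Output (separator lines omitted):

    Two dimensional array: [[ 1  2  4  9 15 20]
     [ 4  2  1  0 24  8]
     [ 3  7  5 13 17  0]]
    1st difference: [[  0   1   3   8  14  19]
     [  3   0  -3  -9   9 -12]
     [ -1   5   4  13  -7  -8]]
    2nd difference: [[  3  -1  -6 -17  -5 -31]
     [ -4   5   7  22 -16   4]]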

Here is the scalar values way:

    # prepend with scalar values - axis=0
    import numpy as np

    two_dim = np.array([[1, 2, 4, 9, 15, 20],
                        [4, 2, 1, 0, 24, 8],
                        [3, 7, 5, 13, 17, 0]])
    one_diff = np.diff(two_dim, n=1, axis=0, prepend=1)
    two_diff = np.diff(two_dim, n=2, axis=0, prepend=1)
    # one_diff = np.diff(two_dim, n=1, axis=0, prepend=[[1, 1, 1, 1, 1, 1]])
    # two_diff = np.diff(two_dim, n=2, axis=0, prepend=[[1, 1, 1, 1, 1, 1]])

    print(f'Two dimensional array: {two_dim}')
    print('-'*85)
    print(f'1st difference: {one_diff}')
    print('-'*85)
    print(f'2nd difference: {two_diff}')

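Output (separator lines omitted):

    Two dimensional array: [[ 1  2  4  9 15 20]
     [ 4  2  1  0 24  8]
     [ 3  7  5 13 17  0]]
    1st difference: [[  0   1   3   8  14  19]
     [  3   0  -3  -9   9 -12]
     [ -1   5   4  13  -7  -8]]
    2nd difference: [[  3  -1  -6 -17  -5 -31]
     [ -4   5   7  22 -16   4]]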

In conclusion, you can pass either a scalar value or an array to prepend or append to `a` along the axis before performing the difference calculation.

It is easier to pass a scalar value if you just want to prepend or append the same value everywhere, while the array option gives you the flexibility to prepend or append any values you want.
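For readers who want a head start on `append`, here is a small sketch on a one-dimensional array. It also shows two uses of `prepend` that I find handy but that are my own additions rather than part of the examples above: keeping the output the same length as the input, and undoing `np.cumsum()`.

```python
import numpy as np

one_dim = np.array([1, 2, 4, 7, 12])

# append works like prepend, but adds the value after the last element
with_append = np.diff(one_dim, append=20)
print(with_append)    # [1 2 3 5 8]

# Prepending the first element keeps the output as long as the input;
# the leading entry is then always 0
same_length = np.diff(one_dim, prepend=one_dim[0])
print(same_length)    # [0 1 2 3 5]

# With prepend=0, np.diff undoes a cumulative sum
restored = np.diff(np.cumsum(one_dim), prepend=0)
print(restored)       # [ 1  2  4  7 12]
```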

## np.diff() vs np.gradient()

Another confusing point about this function is how it differs from another function, `numpy.gradient()`.

- Simply put, `numpy.diff()` calculates the n-th discrete differences between adjacent values along a given axis and only involves subtraction mathematically.
- However, `numpy.gradient()` calculates the gradient of an N-dimensional array and involves subtraction and division mathematically.

For the `numpy.gradient()` function, the gradient is computed using second-order accurate central differences at the interior points and either first- or second-order accurate one-sided (forward or backward) differences at the boundaries. The returned gradient hence has the same shape as the input array.

Intuitively, the `numpy.gradient()` function is used to measure the rate of change in an N-dimensional array, which is like the concept of slope in a two-dimensional plane.

To be honest, `numpy.gradient()` is another hard-to-understand function. If you’d like me to write another article about it, please let me know! 🙂

For now, I hope you know intuitively what the difference is between these two functions.
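To see the contrast in numbers, here is a minimal sketch that runs both functions on the same one-dimensional array, assuming the default unit spacing and default `edge_order` of `np.gradient()`. The variable names are only for illustration.

```python
import numpy as np

y = np.array([1, 2, 4, 7, 12])

# np.diff: plain subtraction of neighbours, one element shorter than y
d = np.diff(y)
print(d)    # [1 2 3 5]

# np.gradient: same length as y, central differences in the interior
g = np.gradient(y)
print(g)    # [1.  1.5 2.5 4.  5. ]

# In the interior, the gradient is the average of the two adjacent
# differences, i.e. (y[i+1] - y[i-1]) / 2
interior = (d[:-1] + d[1:]) / 2
print(np.allclose(g[1:-1], interior))   # True

# At the two ends it falls back to the one-sided difference
print(g[0] == d[0], g[-1] == d[-1])     # True True
```

So with unit spacing, `np.gradient()` can be read as a smoothed, same-length relative of `np.diff()`, which is where the division comes in.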

## np.diff() datetime

In our previous examples, we have only dealt with numerical values. Good news! The `np.diff()` function can also handle `datetime` arrays!

Here is an example of handling `datetime` arrays:

    import numpy as np

    '''
    Generally, the type of the np.diff()’s output is the same as the type of the difference
    between any two elements of input array.
    A notable exception is datetime64, which results in a timedelta64 output array.
    '''
    # dates = np.arange('1100-10-01', '1100-10-05', dtype=np.datetime64)
    # one_diff = np.diff(dates, n=1)
    dates = np.arange('2066-10-13', '2066-10-16', dtype=np.datetime64)
    one_diff = np.diff(dates, n=1)

    print(f'Original dates: {dates}')
    print('-'*85)
    print(f'Original date\'s type: {dates.dtype}')
    print('-'*85)
    print(f'One difference: {one_diff}')
    print('-'*85)
    print(f'One difference\'s type: {one_diff.dtype}')

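Output (separator lines omitted):

    Original dates: ['2066-10-13' '2066-10-14' '2066-10-15']
    Original date's type: datetime64[D]
    One difference: [1 1]
    One difference's type: timedelta64[D]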

Please be aware that, generally, the type of the output of `np.diff()` is the same as the type of the difference between any two elements of the input array.

A notable exception is `datetime64`, as shown here, which results in a `timedelta64` output array.
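Once you have a `timedelta64` array, you often want plain numbers back. The sketch below uses made-up, irregularly spaced dates and shows two ways to get the gaps in days; the variable names are hypothetical.

```python
import numpy as np

# Hypothetical, irregularly spaced dates chosen only for illustration
dates = np.array(['2066-10-13', '2066-10-14', '2066-10-17', '2066-10-24'],
                 dtype='datetime64[D]')

# Differencing datetime64 values gives a timedelta64 array
gaps = np.diff(dates)
print(gaps)          # [1 3 7]
print(gaps.dtype)    # timedelta64[D]

# Dividing by a one-day timedelta yields plain floats
gap_days = gaps / np.timedelta64(1, 'D')
print(gap_days)      # [1. 3. 7.]

# astype(int) drops the unit and keeps the raw day counts
gap_ints = gaps.astype(int)
print(gap_ints)      # [1 3 7]
```

Dividing by `np.timedelta64(1, 'D')` makes the unit explicit, which is the safer choice if the input might use a different resolution such as hours or seconds.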

## Summary

That’s it for our `np.diff()` article.

We learned about its syntax, arguments, and basic examples.

We also worked through the top three questions about the `np.diff()` function: `np.diff` `prepend`, `np.diff` vs. `np.gradient`, and `np.diff` with `datetime`.

Hope you enjoy all this and happy coding!

Anqi Wu is an aspiring Data Scientist and enthusiastic Python Freelancer. She is an incoming student for a Master’s program in Analytics and builds her Python Freelancer profile on Upwork.

Anqi is passionate about machine learning, statistics, data mining, programming, and many other data science related fields. She has proven her expertise during her undergraduate years, including multiple winning and top placements in mathematical modeling contests. She loves supporting and enabling data-driven decision-making, developing data services, and teaching.

She is skilled at programming languages like Python, R, and SQL, actively delving into the world of Machine Learning and Deep Learning and traveling along her data science journey with joy. Data sensitivity and business acumen are her advantages to march towards the career path as a data scientist 🙂

Here is a link to the author’s website: https://www.anqiwu.one/. She uploads data science blogs weekly to document her data science learning and practicing for the past week, along with some best learning resources and inspirational thoughts.

I hope you enjoy this article! Cheers!