How to Calculate the Standard Deviation in NumPy?

Problem Formulation: How to calculate the standard deviation in NumPy?

Variants: There are several variants of this problem:

Calculate the standard deviation of a 1D array
Calculate the standard deviation of a 2D array
Calculate the standard deviation of a 3D array

You can also calculate the standard deviation along an axis:

Calculate the standard deviation of a 2D array along the columns
Calculate the standard deviation of a 2D array along the rows

All of them use the np.std(array, axis) function, which you can adapt to the problem at hand.

Syntax: np.std(array, axis=None)

Argument	`array-like`	Array for which the standard deviation should be calculated
Argument	`axis`	Axis along which the standard deviation should be calculated. Optional; defaults to None, which uses the flattened array.
Return Value	`array` or `number`	If no axis argument is given (or it is set to None), returns a number. Otherwise returns the standard deviation along the axis, which is a NumPy array with one dimension fewer than the input.
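
To make the return value concrete, here is a small sketch that calls np.std() on the same 2D array with and without the axis argument. The variable names (total, per_col, per_row) are illustrative and not part of the NumPy API.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [2, 2, 2]])

total = np.std(arr)            # no axis: one number for all six values
per_col = np.std(arr, axis=0)  # collapses axis 0: one value per column
per_row = np.std(arr, axis=1)  # collapses axis 1: one value per row

print(np.ndim(total), per_col.shape, per_row.shape)
# 0 (3,) (2,)
```

Without an axis the result has zero dimensions (a plain number), while each axis call returns a 1D array.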


Before we dive into the different ways to calculate the standard deviation in NumPy, note that np.std() accepts additional optional arguments, such as dtype, out, ddof, and keepdims, but most of them are rarely used. The NumPy documentation for np.std describes them in detail.
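
Of the optional arguments, ddof ("delta degrees of freedom") is probably the one you are most likely to need. As a minimal sketch: by default np.std() divides by N (the population standard deviation), and passing ddof=1 divides by N - 1 instead (the sample standard deviation). The variable names below are illustrative.

```python
import numpy as np

arr = np.array([0, 10, 0])

population = np.std(arr)      # default ddof=0: divide by N
sample = np.std(arr, ddof=1)  # ddof=1: divide by N - 1

print(population)  # roughly 4.71
print(sample)      # roughly 5.77
```

The sample value is always at least as large as the population value because the divisor is smaller.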

How to calculate the standard deviation of a 1D array

import numpy as np

arr = np.array([0, 10, 0])
dev = np.std(arr)

print(dev)
# 4.714045207910316
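
To see where this number comes from, the following sketch recomputes it by hand: the standard deviation is the square root of the average squared deviation from the mean. The intermediate names (mean, variance, manual) are chosen here for illustration.

```python
import numpy as np

arr = np.array([0, 10, 0])

mean = arr.mean()                      # 10 / 3
variance = np.mean((arr - mean) ** 2)  # average squared deviation from the mean
manual = np.sqrt(variance)             # standard deviation

print(np.isclose(manual, np.std(arr)))
# True
```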

How to calculate the standard deviation of a 2D array

import numpy as np

arr = np.array([[1, 2, 3],
                [1, 1, 1]])
dev = np.std(arr)
print(dev)
# 0.7637626158259734

How to calculate the standard deviation of a 3D array

import numpy as np

arr = np.array([[[1, 1], [0, 0]],
                [[0, 0], [0, 0]]])
dev = np.std(arr)
print(dev)
# 0.4330127018922193

You can pass an n-dimensional array, and as long as you omit the axis argument, NumPy calculates the standard deviation of the flattened array.
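
A quick way to convince yourself of this is to flatten the array explicitly and compare the results. This sketch uses arr.ravel() for the flattening; np.isclose() is used for the comparison to stay safe against floating-point rounding.

```python
import numpy as np

arr = np.array([[[1, 1], [0, 0]],
                [[0, 0], [0, 0]]])

flat = arr.ravel()  # 1D array holding all eight values
print(flat)
# [1 1 0 0 0 0 0 0]

print(np.isclose(np.std(arr), np.std(flat)))
# True
```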

How to calculate the standard deviation of a 2D array along the columns

import numpy as np

matrix = [[1, 2, 3],
          [2, 2, 2]]

# calculate standard deviation along columns
y = np.std(matrix, axis=0)
print(y)
# [0.5 0.  0.5]

How to calculate the standard deviation of a 2D array along the rows

import numpy as np

matrix = [[1, 2, 3],
          [2, 2, 2]]

# calculate standard deviation along rows
z = np.std(matrix, axis=1)
print(z)
# [0.81649658 0.        ]

Data Science NumPy Puzzle

import numpy as np

# daily stock prices
# [open, close]
google = np.array(
    [[1239, 1258], # day 1
     [1262, 1248], # day 2
     [1181, 1205]]) # day 3

# standard deviation
y = np.std(google, axis=1)

print(y[2] == max(y))

What is the output of this puzzle?
*Advanced Level*


NumPy is a popular Python library for data science that focuses on arrays, vectors, and matrices.

This puzzle introduces the standard deviation function of the NumPy library. When applied to a 1D array, this function returns its standard deviation. When applied to a 2D array without an axis argument, NumPy flattens the array first, so the result is the standard deviation of the flattened 1D array.

In the puzzle, we have a matrix with three rows and two columns. The matrix stores the open and close prices of the Google stock for three consecutive days. The first column specifies the opening price, the second the closing price.

We are interested in the standard deviation for each of the three days: how much do the opening and closing prices of a day deviate from their mean?

NumPy provides this functionality via the axis parameter. In a 2D matrix, axis=0 runs down the rows and axis=1 runs across the columns. The axis you pass is the one that gets collapsed: axis=0 yields one value per column, and axis=1 yields one value per row. We want one value per day, that is, per row, so we use axis=1. This results in three standard deviation values, one per day.

The three values are 9.5, 7.0, and 12.0, so the third day has the highest standard deviation and the puzzle prints True.
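
You can check the per-day values yourself with the sketch below. It reuses the puzzle's google array and prints the intermediate result via tolist(), which is added here only for readable output. With just two prices per day, each standard deviation is half the absolute difference between open and close.

```python
import numpy as np

# daily stock prices: [open, close]
google = np.array(
    [[1239, 1258],   # day 1
     [1262, 1248],   # day 2
     [1181, 1205]])  # day 3

# axis=1 collapses the columns, leaving one value per row (day)
y = np.std(google, axis=1)

print(y.tolist())
# [9.5, 7.0, 12.0]

print(y[2] == max(y))
# True
```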
