NumPy Sort [Ultimate Guide] - Be on the Right Side of Change

The np.sort(array) function returns a sorted copy of the specified NumPy array. By default, it sorts the values in ascending order, so np.sort([42, 2, 21]) returns the NumPy array [2 21 42].

Here’s an example of 1D sorting:

>>> import numpy as np
>>> np.sort([42, 2, 21])
array([ 2, 21, 42])

And here’s an example of 2D sorting: each row is sorted separately, because the default sort axis is the last one.

>>> np.sort([[4, 2, 5], 
             [3, 2, 6]])
array([[2, 4, 5],
       [2, 3, 6]])

An example of 3D sorting: only the innermost axis is sorted by default.

>>> np.sort([[[5, 4], [3, 1]], 
             [[9, 1], [6, 3]]])
array([[[4, 5],
        [1, 3]],

       [[1, 9],
        [3, 6]]])

Let’s dive into the NumPy sorting function slowly and thoroughly next!

Motivation

Imagine you need to find a book on your bookshelf. Which situation would you prefer:

A) your bookshelf contains all your books in no specific order, or
B) your bookshelf contains all books alphabetically sorted by title.

Of course, option B) would save you a lot of time, especially if you access your bookshelf multiple times. This article will show you how to use sorting in a single line of Python using the NumPy library. The article is loosely based on chapters from my book “Python One-liners”.

Sorting is at the heart of more advanced applications such as commercial computing, graph traversal, or search algorithms. Fortunately, NumPy provides different sorting algorithms – the default sorting algorithm being the popular “Quicksort” algorithm.

NumPy Sort Syntax

numpy.sort(a, axis=-1, kind=None, order=None)

a – An array-like data structure to be sorted.
axis – An integer identifying the axis along which the array should be sorted. If you set it to None, the array is flattened and then sorted. By default, axis is set to -1, which sorts the array along the inner (last) axis.
kind – The sorting algorithm to be used. Can be any of the following: {'quicksort', 'mergesort', 'heapsort', 'stable'}. By default, it uses 'quicksort'.
order – For a structured array with named fields, it specifies which fields to compare first, second, and so on.
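To make the axis and order arguments more concrete, here is a minimal sketch. The structured array, its field names ("name", "age"), and the records are made up for illustration only.

```python
import numpy as np

# axis=None flattens the array before sorting.
a = np.array([[4, 2, 5],
              [3, 2, 6]])
flat_sorted = np.sort(a, axis=None)
print(flat_sorted)
# [2 2 3 4 5 6]

# order: sort a structured array by the 'age' field first, then by 'name'.
# The field names and records are hypothetical example data.
people = np.array([("Bob", 25), ("Alice", 31), ("Carl", 25)],
                  dtype=[("name", "U10"), ("age", int)])
by_age = np.sort(people, order=["age", "name"])
print(by_age["name"])
# ['Bob' 'Carl' 'Alice']
```

Bob and Carl share the same age, so the second field in order ('name') decides which of the two comes first.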

NumPy Sort Runtime Complexity

The runtime complexity of the np.sort() function depends on the sorting algorithm selected with the kind argument. By default, NumPy uses the Quicksort algorithm, which has quadratic worst-case runtime complexity but O(n * log(n)) on average.

Here are the different variants, as listed in the NumPy documentation:

`kind` Argument	Worst-Case Runtime Complexity	Work Space
`'quicksort'`	O(n^2)	0
`'heapsort'`	O(n*log(n))	0
`'mergesort'`	O(n*log(n))	~n/2
`'timsort'`	O(n*log(n))	~n/2
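Whichever algorithm you pick, the sorted values are the same. Only runtime and memory use differ. The following sketch checks this on random test data, with the array size and seed chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed for reproducibility
data = rng.integers(0, 1000, size=10_000)

reference = np.sort(data)             # default kind
results = {kind: np.sort(data, kind=kind)
           for kind in ("quicksort", "heapsort", "mergesort", "stable")}

for kind, result in results.items():
    # Every algorithm must produce the same sorted values.
    print(kind, np.array_equal(result, reference))
# quicksort True
# heapsort True
# mergesort True
# stable True
```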

NumPy Sort vs Argsort

The difference between np.sort() and np.argsort() is that the former returns a sorted array copy and the latter returns an array of indices that define how to obtain the sorted array from the original array.

I’ll give you an example next. Conceptually, you can view sorting as a “black box” where you can put in a NumPy array and get out a sorted NumPy array.

**Figure**: `np.sort()` returns the sorted array whereas `np.argsort()` returns an array of the corresponding indices.

The figure shows how the algorithm transforms an unsorted array [10, 6, 8, 2, 5, 4, 9, 1] into a sorted array [1, 2, 4, 5, 6, 8, 9, 10]. This is the purpose of NumPy’s sort() function.

But oftentimes it is not only important to sort the array itself, but also to get the array of indices that would transform the unsorted array into a sorted array. For example, the array element “1” of the unsorted array has index “7”. Since the array element “1” is the first element of the sorted array, its index “7” is the first element of the sorted indices. This is the purpose of NumPy’s argsort() function.

This small code snippet demonstrates how you would use sort() and argsort() in NumPy:

import numpy as np


a = np.array([10, 6, 8, 2, 5, 4, 9, 1])

print(np.sort(a))
# [ 1  2  4  5  6  8  9 10]

print(np.argsort(a))
# [7 3 5 4 1 2 6 0]
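The relationship between the two functions can be checked directly: indexing the original array with the argsort() result gives the same array that sort() returns. As a small extra, and only as a sketch, applying argsort() to the indices themselves yields the rank of each original element.

```python
import numpy as np

a = np.array([10, 6, 8, 2, 5, 4, 9, 1])
idx = np.argsort(a)

# Indexing with the argsort result reproduces the sorted array.
print(a[idx])
# [ 1  2  4  5  6  8  9 10]
print(np.array_equal(a[idx], np.sort(a)))
# True

# argsort of the indices gives each element's rank in the sorted order.
ranks = np.argsort(idx)
print(ranks)
# [7 4 5 1 3 2 6 0]
```

For example, the element 10 at index 0 has rank 7 because it is the largest of the eight values.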

Sorting Along an Axis

You may ask: how is NumPy’s sort() function different from Python’s sorted() function? The answer is simple: you can use NumPy to sort multi-dimensional arrays, too!

The figure shows two ways to use the sorting function to sort a two-dimensional array. The array to be sorted has two axes: axis 0 (the rows) and axis 1 (the columns). Now, you can sort along axis 0 (vertically, so each column ends up sorted) or along axis 1 (horizontally, so each row ends up sorted). In general, the axis keyword defines the direction along which you perform the NumPy operation.

Here is the code snippet that shows technically how to do this:

import numpy as np


a = np.array([[1, 6, 2],
              [5, 1, 1],
              [8, 0, 1]])

print(np.sort(a, axis=0))
"""
[[1 0 1]
 [5 1 1]
 [8 6 2]]
"""

print(np.sort(a, axis=1))
"""
[[1 2 6]
 [1 1 5]
 [0 1 8]]
"""

The example shows that the optional axis argument helps you sort the NumPy array along a fixed direction. This is the main strength of NumPy’s sort() function compared to Python’s built-in sorted() function.
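Two related details, shown here as a small sketch on the same array: for a 2D array the default axis=-1 is identical to axis=1, and since sort() has no descending option, one common approach is to sort ascending and then reverse along the same axis.

```python
import numpy as np

a = np.array([[1, 6, 2],
              [5, 1, 1],
              [8, 0, 1]])

# For a 2D array, the default axis=-1 means the same as axis=1.
print(np.array_equal(np.sort(a), np.sort(a, axis=1)))
# True

# Descending order within each row: sort ascending, then reverse along axis 1.
desc_rows = np.sort(a, axis=1)[:, ::-1]
print(desc_rows)
"""
[[6 2 1]
 [5 1 1]
 [8 1 0]]
"""
```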

Practical Example

The one-liner solves the following problem: “Find the names of the top three students with the highest SAT scores.”

Note that simply sorting an array of SAT scores does not solve the problem because the problem asks for the names of the students. Have a look at the data first and then try to find the one-liner solution yourself.

## Dependencies
import numpy as np


## Data: SAT scores for different students
sat_scores = np.array([1100, 1256, 1543, 1043, 989, 1412, 1343])
students = np.array(["John", "Bob", "Alice", "Joe", "Jane", "Frank", "Carl"])


## One-liner
top_3 = students[np.argsort(sat_scores)][::-1][:3]


## Result
print(top_3)

Exercise: What’s the output of this code snippet?

Initially, the code defines the data consisting of the SAT scores of students as a one-dimensional data array, as well as the names of these students. For example, student “John” achieved an SAT score of “1100”, while “Frank” achieved an SAT score of “1412”.

The question is to find the names of the three most successful students. The one-liner achieves this objective – not by simply sorting the SAT scores – but by running the argsort() function. Recall that the argsort() function returns an array of indices such that the respective data array elements would be sorted.

Here is the output of the argsort function on the SAT scores:

print(np.argsort(sat_scores))
# [4 3 0 1 6 5 2]

Why is the index “4” at the first position of the output? Because student “Jane” has the lowest SAT score with 989 points. Note that both sort() and argsort() sort in an ascending manner from lowest to highest values.

You have the sorted indices, but what now? The idea is to get the names of the respective students. This can be achieved by using simple indexing on the students array:

print(students[np.argsort(sat_scores)])
# ['Jane' 'Joe' 'John' 'Bob' 'Carl' 'Frank' 'Alice']

You already know that “Jane” has the lowest SAT score, while “Alice” has the highest SAT score. The only thing left is to reverse this array (so it runs from highest to lowest) and extract the top three students using simple slicing:

## One-liner
top_3 = students[np.argsort(sat_scores)][::-1][:3]


## Result
print(top_3)
# ['Alice' 'Frank' 'Carl']

Alice, Frank, and Carl are the students with the highest SAT scores 1543, 1412, and 1343, respectively.
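If you need the same result for a different number of students, the idea generalizes easily. The helper name top_k below is hypothetical and not part of NumPy. It is only a sketch of the argsort-then-reverse approach described above.

```python
import numpy as np

def top_k(names, scores, k):
    """Return the k names with the highest scores, best first (hypothetical helper)."""
    order = np.argsort(scores)[::-1]   # indices from highest to lowest score
    return names[order[:k]]

sat_scores = np.array([1100, 1256, 1543, 1043, 989, 1412, 1343])
students = np.array(["John", "Bob", "Alice", "Joe", "Jane", "Frank", "Carl"])

print(top_k(students, sat_scores, 3))
# ['Alice' 'Frank' 'Carl']
print(top_k(students, sat_scores, 2))
# ['Alice' 'Frank']
```

Slicing the reversed index array with [:k] works for any array length, because it never depends on how many students there are.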
