In Python, the `numpy.argpartition()`

function **returns the indices that would partition an array along with a given axis based on the specified kth element(s).**

All elements smaller than the kth element will be moved before it and all larger elements behind it. The element order in the partitions is undefined.

If provided with a sequence of kth, it will partition all of them into their sorted position at once.

Here is the argument table of `np.argpartition()`

.

If it sounds great to you, please continue reading, and you will fully understand the `np.argpartition()`

function through Python code snippets and vivid visualization.

This tutorial is about `np.argpartition()`

function.

- Concretely, I will introduce its syntax and arguments.
- Then, you will learn some basic examples of this function.
- Finally, I will address three top questions about
`np.argpartition()`

, including`np.argpartition`

2d array/axis, np.argpartition order, and`np.argpartition`

ignore`np.nan`

.

You can find all codes in this tutorial here.

Table of Contents

## Syntax and Arguments

Here is the syntax of `np.argpartition()`

:

# Syntax numpy.argpartition(a, kth[, axis=- 1[, kind='introselect'[, order=None]]])

Here is the argument table of `np.argpartition()`

:

Argument | Accept | Description |

`a` | `array_like` | Array to sort. |

`kth` | `int` or sequence of ints | Element index to partition by. All smaller elements will be moved before it and all larger elements behind it. The element order in the partitions is undefined. If provided with a sequence of k-th, it will partition all of them into their sorted position at once. |

`axis` | `int` or `None` , optional | Axis along which to sort. The default is -1 (the last axis). If `None` , the flattened array is used. |

`kind` | `{'introselect'}` , optional | Selection algorithm. Default is `'introselect'` |

`order` | `str` or list of `str` , optional | This argument specifies the order in which to compare fields. A single field can be specified as a string, and not all fields need to be specified, but unspecified fields will still be used, in the order in which they come up in the `dtype` , to break ties. |

Don’ panic! I wil explain the kth, axis, and order argument clearly later!

The **output** of `np.argpartition()`

function is an `index_array`

. If the input array `a`

is one-dimensional, `a[index_array]`

yields a partitioned `a`

. Generally, `np.take_along_axis(a, index_array, axis=a)`

always yields the partitioned `a`

, irrespective of dimensionality.

## Basic Examples

Here is a one-dimensional array code example:

# Basic Example import numpy as np one_dim = np.array([2, 3, 1, 5, 4]) # kth=0 -> partition based on 2(zero index). partitioned = np.argpartition(one_dim, 0) print(f'Unpartitioned array: {one_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {one_dim[partitioned]}')

Output:

As you can see, the partitioned array does not guarantee the element order in the partitions. It just separates elements into groups based on the kth argument!

This is a way to do partial sort and has an efficient time complexity O(n).

If you are more interested in sorting array as a whole, you might want to check out the `numpy.argsort()`

function and here is my tutorial on numpy.argsort().

## np.argpartition() 2d array/axis

In this part, I will show you how to deploy the argument axis with 2d array. Of course, you can extrapolate this into an n-dimensional array!

Just for your quick reference, here is the argument table of `np.argpartition()`

:

Here is the 2d array code example:

import numpy as np # axis = 0 [partial sort along the axis = 0. In this case, row-like.] print('axis = 0') two_dim = np.array([[1, 4, 3], [3, 2, 1]]) partitioned = np.argpartition(two_dim, kth=0, axis=0) print(f'Unpartitioned array: {two_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {np.take_along_axis(two_dim, partitioned, axis=0)}') # axis = 1 [partial sort along the axis = 1. In this case, column-like.] print('-'*85) print('axis = 1') two_dim = np.array([[1, 4, 3], [3, 2, 1]]) partitioned = np.argpartition(two_dim, kth=0, axis=1) print(f'Unpartitioned array: {two_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {np.take_along_axis(two_dim, partitioned, axis=1)}')

Output:

## np.argpartition() order

In this part, I will show you another exciting argument, order.

Just for your quick reference, here is the argument table of `np.argpartition()`

:

Here is a one-dimensional array code example:

import numpy as np # order: x -> y print('order: x -> y') one_dim = np.array([(1, 3), (2, 1), (1, 1), (2, 3)], dtype=[('x', float), ('y', float)]) partitioned = np.argpartition(one_dim, kth=0, order=['x', 'y']) # In this case, same as: # partitioned = np.argpartition(one_dim, kth=0) print(f'Unpartitioned array: {one_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {one_dim[partitioned]}') # order: y -> x print('-'*85) print('order: y -> x') one_dim = np.array([(1, 3), (2, 1), (1, 1), (2, 3)], dtype=[('x', float), ('y', float)]) partitioned = np.argpartition(one_dim, kth=0, order=['y', 'x']) print(f'Unpartitioned array: {one_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {one_dim[partitioned]}')

Output:

## np.argpartition() ignore np.nan

When we encounter tricky some `np.nan`

in a sorting problem, we can exclude them by counting the number of `np.nan`

and leaving them out through indexing.

Here is a case of getting the N biggest element and index in an array containing `np.nan`

:

import numpy as np # * Do not handle np.nan problem: print('Do not handle np.nan problem:') one_dim = np.array([2, 3, 1, np.nan, 5, 4, np.nan]) # kth=0 -> partition based on 2(zero index). partitioned = np.argpartition(one_dim, 0) print(f'Unpartitioned array: {one_dim}') print(f'Partitioned array index: {partitioned}') print(f'Partitioned array: {one_dim[partitioned]}') # * Handle np.nan problem: # np.nan is taken as an infinite large number in Python. print('-'*85) print('Handle np.nan problem:') one_dim = np.array([2, 3, 1, np.nan, 5, 4, np.nan]) # If we want to get the N biggest element and index in an array: print('(If we want to get the N biggest element and index in an array)') N = 2 # get the number of np.nan c = np.isnan(one_dim).sum() partitioned_idx = np.argpartition(one_dim, -N-c)[-N-c:-c] partitioned_val = one_dim[partitioned_idx] print(f'Unpartitioned array: {one_dim}') print(f'Partitioned array index: {partitioned_idx}') print(f'Partitioned array: {partitioned_val}')

Output:

## Summary

That’s it for our `np.argpartition()`

article.

We learned about its syntax, arguments, and basic examples.

We also worked on the top three questions about the `np.argpartition()`

function, ranging from `np.argpartition`

2d array/axis, `np.argpartition`

order, and `np.argpartition`

ignore `np.nan`

.

By the way, if you want to do a direct partial sort, please refer to the `numpy.partition()`

function. I believe that after reading this tutorial, you can understand `numpy.partition()`

easily!

Hope you enjoy all this and happy coding!

Anqi Wu is an aspiring Data Scientist and enthusiastic Python Freelancer. She is an incoming student for a Master’s program in Analytics and builds her Python Freelancer profile on Upwork.

Anqi is passionate about machine learning, statistics, data mining, programming, and many other data science related fields. She has proven her expertise during her undergraduate years, including multiple winning and top placements in mathematical modeling contests. She loves supporting and enabling data-driven decision-making, developing data services, and teaching.

She is skilled at programming languages like Python, R, and SQL, actively delving into the world of Machine Learning and Deep Learning and traveling along her data science journey with joy. Data sensitivity and business acumen are her advantages to march towards the career path as a data scientist 🙂

Here is a link to the author’s website: https://www.anqiwu.one/. She uploads data science blogs weekly to document her data science learning and practicing for the past week, along with some best learning resources and inspirational thoughts.

I hope you enjoy this article! Cheers!