Pandas DataFrame Methods: idxmax(), idxmin(), reindex(), reindex_like(), rename(), rename_axis()

The Pandas DataFrame has several re-indexing, selection, and label manipulation methods. When applied to a DataFrame, these methods look up or modify its labels and return the results.

This is Part 9 of the DataFrame methods series:

  • Part 1 focuses on the DataFrame methods abs(), all(), any(), clip(), corr(), and corrwith().
  • Part 2 focuses on the DataFrame methods count(), cov(), cummax(), cummin(), cumprod(), cumsum().
  • Part 3 focuses on the DataFrame methods describe(), diff(), eval(), kurtosis().
  • Part 4 focuses on the DataFrame methods mad(), min(), max(), mean(), median(), and mode().
  • Part 5 focuses on the DataFrame methods pct_change(), quantile(), rank(), round(), prod(), and product().
  • Part 6 focuses on the DataFrame methods add_prefix(), add_suffix(), and align().
  • Part 7 focuses on the DataFrame methods at_time(), between_time(), drop(), drop_duplicates() and duplicated().
  • Part 8 focuses on the DataFrame methods equals(), filter(), first(), last(), head(), and tail().
  • Part 9 focuses on the DataFrame methods idxmax(), idxmin(), reindex(), reindex_like(), rename(), and rename_axis().

Getting Started

Remember to add the Required Starter Code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

Required Starter Code

import pandas as pd

Before any data manipulation can occur, this library will require installation:

  • The pandas library enables access to/from a DataFrame.

To install this library, navigate to an IDE terminal and execute the code below at the command prompt. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.

DataFrame idxmax()

The idxmax() method returns the index label of the first occurrence of the maximum value over a selected axis.

The syntax for this method is as follows:

DataFrame.idxmax(axis=0, skipna=True)
Parameter   Description
axis        If zero (0) or index is selected, return the row label of the maximum value for each column. If one (1) or columns is selected, return the column label of the maximum value for each row. Default is 0.
skipna      If set to True, NaN/NULL values are excluded from the search. Default is True.

For this example, the DataFrame for Rivers Clothing depicts their inventory based on available sizes (index). Running this code shows, for each item, the size with the maximum (highest) inventory.

Code – Pandas Example:

df_inv = pd.DataFrame({'Tops':   [22, 12,  19,   8, 23],
                       'Pants':  [5,    7,    17,  19, 12],
                       'Coats':  [11,  18,   1,   16,  3]},
                       index =  ['XS','S', 'M', 'L', 'XL'])

result = df_inv.idxmax(axis=0)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df_inv.
  • Line [2] retrieves, for each column, the index label of the row holding the maximum value. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Tops     XL
Pants     L
Coats     S
dtype: object
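
Setting axis=1 flips the question: rather than the best-stocked size for each item, the method reports the best-stocked item for each size. The following is a minimal sketch using the same inventory data. The variable name top_item is only illustrative.

```python
import pandas as pd

df_inv = pd.DataFrame({'Tops':  [22, 12, 19,  8, 23],
                       'Pants': [5,   7, 17, 19, 12],
                       'Coats': [11, 18,  1, 16,  3]},
                      index=['XS', 'S', 'M', 'L', 'XL'])

# For each row (size), return the column label holding the largest value.
top_item = df_inv.idxmax(axis=1)
print(top_item)
```

Here the result is indexed by size, and each value is one of the column names (Tops, Pants, or Coats).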

For this example, a Series records five days of daytime high temperatures. The idxmax() method returns the index label of the highest temperature.

Code – Series Example:

temps = pd.Series(data=[5, 11, 24, 35, 49],
                  index=['Day-1', 'Day-2', 'Day-3', 'Day-4', 'Day-5'])
print(temps.idxmax())
  • Line [1] creates a Series of temperatures and saves it to temps.
  • Line [2] retrieves the index label of the maximum value and outputs it to the terminal.

Output:

Day-5

Note: The NumPy counterpart of this method is numpy.argmax, which returns the integer position of the maximum value rather than its label.
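
To make the comparison with NumPy concrete, the sketch below runs both calls on the same temperature data. The variable names label and position are illustrative and not part of either library.

```python
import numpy as np
import pandas as pd

temps = pd.Series(data=[5, 11, 24, 35, 49],
                  index=['Day-1', 'Day-2', 'Day-3', 'Day-4', 'Day-5'])

# idxmax() returns the index label of the maximum value.
label = temps.idxmax()

# numpy.argmax() returns the zero-based integer position of the maximum value.
position = int(np.argmax(temps.to_numpy()))

print(label)                   # Day-5
print(position)                # 4
print(temps.index[position])   # Day-5
```

Looking up the position in temps.index yields the same label that idxmax() returns directly.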

DataFrame idxmin()

The idxmin() method returns the index label of the first occurrence of the minimum value over a selected axis.

The syntax for this method is as follows:

DataFrame.idxmin(axis=0, skipna=True)
Parameter   Description
axis        If zero (0) or index is selected, return the row label of the minimum value for each column. If one (1) or columns is selected, return the column label of the minimum value for each row. Default is 0.
skipna      If set to True, NaN/NULL values are excluded from the search. Default is True.

For this example, the DataFrame for Rivers Clothing depicts their inventory based on available sizes (index). Running this code shows, for each item, the size with the minimum (lowest) inventory.

Code – Pandas Example:

df_inv = pd.DataFrame({'Tops':   [22, 12,  19,   8, 23],
                       'Pants':  [5,    7,    17,  19, 12],
                       'Coats':  [11,  18,   1,   16,  3]},
                       index =  ['XS','S', 'M', 'L', 'XL'])

result = df_inv.idxmin(axis=0)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df_inv.
  • Line [2] retrieves, for each column, the index label of the row holding the minimum value. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Tops      L
Pants    XS
Coats     M
dtype: object
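
The labels returned by idxmin() can be passed straight to loc to fetch the matching values. The sketch below pairs each item with its lowest-stocked size and the quantity on hand. The names lowest_size and lowest are illustrative.

```python
import pandas as pd

df_inv = pd.DataFrame({'Tops':  [22, 12, 19,  8, 23],
                       'Pants': [5,   7, 17, 19, 12],
                       'Coats': [11, 18,  1, 16,  3]},
                      index=['XS', 'S', 'M', 'L', 'XL'])

# One row label (size) per column (item).
lowest_size = df_inv.idxmin(axis=0)

# Use each (size, item) pair to look up the actual minimum quantity.
lowest = {item: (size, int(df_inv.loc[size, item]))
          for item, size in lowest_size.items()}
print(lowest)
```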

For this example, a Series records five days of daytime high temperatures, one of which is missing. The idxmin() method skips the missing value and returns the index label of the lowest temperature.

Code – Series Example:

temps = pd.Series(data=[5, None, 24, 35, 49],
                  index=['Day-1', 'Day-2', 'Day-3', 'Day-4', 'Day-5'])
print(temps.idxmin())
  • Line [1] creates a Series of temperatures and saves it to temps.
  • Line [2] retrieves the index label of the minimum value and outputs it to the terminal.

Output:

Day-1

Note: The NumPy counterpart of this method is numpy.argmin, which returns the integer position of the minimum value rather than its label.

DataFrame reindex()

The reindex() method conforms a DataFrame/Series to a new index. Labels in the new index that have no match in the original produce NaN/NULL values, which the method and fill_value parameters can fill.

The syntax for this method is as follows:

DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, 
                  copy=True, level=None, fill_value=nan, limit=None, tolerance=None)
Parameter    Description
labels       A list of new labels (index or column names) for the axis selected with axis.
index        The new labels for the index (rows). See the calling conventions below.
columns      The new labels for the columns. See the calling conventions below.
axis         The axis that labels applies to: zero (0) or index for the rows, one (1) or columns for the columns. Default is None.
method       Option to use when filling in NaN/NULL values when a reindex occurs. Available options are None, pad/ffill, backfill/bfill, or nearest. This option only applies to a monotonically increasing or decreasing index.
copy         If True, return a new DataFrame/Series even if the new index is the same as the current one. By default, True.
level        The integer/name of the level if working with MultiIndex.
fill_value   Fill value to use for NaN/NULL values. By default, NaN.
limit        Maximum number of consecutive elements to forward/backward fill.
tolerance    Maximum distance between the original labels and the new labels for inexact matches.

Note: The DataFrame reindex() method has two (2) calling conventions:

  • (index=index_labels, columns=column_labels)
  • (labels, axis={'index', 'columns'})
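
The two conventions are interchangeable. The sketch below reindexes the Rivers Clothing inventory both ways and confirms the results match. The variable names and the choice of columns are illustrative.

```python
import pandas as pd

df_inv = pd.DataFrame({'Tops':  [22, 12, 19,  8, 23],
                       'Pants': [5,   7, 17, 19, 12],
                       'Coats': [11, 18,  1, 16,  3]},
                      index=['XS', 'S', 'M', 'L', 'XL'])

new_index = ['XXS', 'XS', 'S', 'M', 'L']
new_columns = ['Tops', 'Coats']

# Convention 1: name the rows and columns with keyword arguments.
by_keyword = df_inv.reindex(index=new_index, columns=new_columns)

# Convention 2: pass the labels and state which axis they belong to.
by_axis = (df_inv.reindex(new_index, axis='index')
                 .reindex(new_columns, axis='columns'))

print(by_keyword.equals(by_axis))   # True
```

The keyword form handles both axes in a single call, while the labels/axis form works on one axis at a time.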

For this example, Rivers Clothing wants to replace XL with XXS. Running the code below accomplishes this task.

df_inv = pd.DataFrame({'Tops':   [22, 12,  19,   8, 23],
                       'Pants':  [5,    7,    17,  19, 12],
                       'Coats':  [11,  18,   1,   16,  3]},
                       index =  ['XS', 'S',  'M',  'L',  'XL'])

new_index = ['XXS', 'XS', 'S', 'M', 'L']
result = df_inv.reindex(new_index, fill_value=0)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df_inv.
  • Line [2] defines the new index labels (adding in XXS and removing XL) and saves them to new_index.
  • Line [3] does the following:
    • reindexes the DataFrame to the new index.
    • fills the vacant values in the new XXS row with zeros (0).
    • saves the output to result.
  • Line [4] outputs the result to the terminal.

Output:

     Tops  Pants  Coats
XXS     0      0      0
XS     22      5     11
S      12      7     18
M      19     17      1
L       8     19     16
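
The method parameter offers an alternative to a fixed fill_value. As a sketch, assume a hypothetical Series of stock counts taken on three dates (the dates and the names stock, all_days, unfilled, and filled are made up for illustration). Reindexing to every day in the range and passing method='ffill' carries the last known count forward.

```python
import pandas as pd

# Hypothetical stock counts recorded on three non-consecutive days.
stock = pd.Series([22, 19, 23],
                  index=pd.to_datetime(['2022-01-01', '2022-01-03', '2022-01-06']))

all_days = pd.date_range(start='2022-01-01', end='2022-01-06', freq='D')

# Without a fill method, the three days with no count become NaN.
unfilled = stock.reindex(all_days)

# With method='ffill', each gap takes the most recent earlier value.
filled = stock.reindex(all_days, method='ffill')

print(unfilled)
print(filled)
```

Forward filling needs a sorted (monotonic) index, which is why this sketch uses dates rather than the size labels from the inventory example.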

DataFrame reindex_like()

The reindex_like() method returns an object (DataFrame/Series) whose indexes match those of another object (DataFrame/Series).

💡 Note: A new object (DataFrame/Series) is created unless the new index is the same as the current one and the copy parameter is False.

For this example, df1 contains a 4-day forecast with Celsius, Fahrenheit, and Wind readings, and df2 contains a 3-day forecast with Celsius and Wind readings.

df1 = pd.DataFrame([[24, 115, 'extreme'],
                    [31, 87,  'high'],
                    [22, 65,  'medium'],
                    [3,  9,   'low']],
                   columns=['Cel.', 'Fah.', 'Wind'],
                   index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

df2 = pd.DataFrame([[8,  'low'],
                    [3,  'low'],
                    [54, 'medium']],
                   columns=['Cel.', 'Wind'],
                   index=pd.DatetimeIndex(['2014-02-12', '2014-02-13', '2014-02-15']))

print(df1)
result = df2.reindex_like(df1)
print(result)
  • Line [1] creates a DataFrame with Celsius, Fahrenheit, and Wind for four (4) days using date_range() and saves it to df1.
  • Line [2] creates a DataFrame with Celsius and Wind for three (3) days using DatetimeIndex() and saves it to df2. Its column names match those in df1 so that the values carry over.
  • Line [3] outputs df1 to the terminal.
  • Line [4] reindexes df2 to match the index and columns of df1 and saves it to the result variable. Labels missing from df2 (the 2014-02-14 row and the Fah. column) fill with NaN.
  • Line [5] outputs the result to the terminal.

Output:

df1
            Cel.  Fah.     Wind
2014-02-12    24   115  extreme
2014-02-13    31    87     high
2014-02-14    22    65   medium
2014-02-15     3     9      low
result
            Cel.  Fah.    Wind
2014-02-12   8.0   NaN     low
2014-02-13   3.0   NaN     low
2014-02-14   NaN   NaN     NaN
2014-02-15  54.0   NaN  medium
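
Put another way, reindex_like() behaves like a reindex() call that borrows both the index and the columns from the other object. The sketch below checks that equivalence on a pair of small forecast DataFrames. The column names in df2 are chosen here to overlap with df1, which is an assumption of this sketch.

```python
import pandas as pd

df1 = pd.DataFrame([[24, 115, 'extreme'],
                    [31, 87,  'high'],
                    [22, 65,  'medium'],
                    [3,  9,   'low']],
                   columns=['Cel.', 'Fah.', 'Wind'],
                   index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

# df2 shares the 'Cel.' and 'Wind' column names with df1 (assumed for this sketch).
df2 = pd.DataFrame([[8,  'low'],
                    [3,  'low'],
                    [54, 'medium']],
                   columns=['Cel.', 'Wind'],
                   index=pd.DatetimeIndex(['2014-02-12', '2014-02-13', '2014-02-15']))

like = df2.reindex_like(df1)
explicit = df2.reindex(index=df1.index, columns=df1.columns)

print(like.equals(explicit))   # True
```

Only labels present in both objects carry values over. A column or row that exists in df1 but not in df2 comes back entirely NaN.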

DataFrame rename()

The rename() method changes the axis label(s) in a DataFrame/Series.

The syntax for this method is as follows:

DataFrame.rename(mapper=None, index=None, columns=None, axis=None, copy=True, inplace=False, level=None, errors='ignore')
Parameter   Description
mapper      Dictionary or function transformations to apply to the labels of an axis. Use mapper with axis to specify the axis.
index       Rather than using mapper with axis, you can pass the dictionary or function for the index labels here.
columns     Rather than using mapper with axis, you can pass the dictionary or function for the column labels here.
axis        The axis that mapper applies to: zero (0) or index for the rows, one (1) or columns for the columns. Default is None.
copy        If set to True, a copy is created. This parameter is True by default.
inplace     If set to True, the changes apply to the original DataFrame. If False, the changes apply to a new DataFrame. By default, False.
level       If working with a MultiIndex, only the labels in the selected level are renamed.
errors      If set to 'raise', a KeyError is raised when a label in the dictionary is not found in the axis. If set to 'ignore', missing labels are skipped. By default, 'ignore'.

For this example, the same 4-day forecast DataFrame used above is modified.

df = pd.DataFrame([[24, 115, 'extreme'],
                   [31, 87,  'high'],
                   [22, 65,  'medium'],
                   [3,  9,   'low']],
                  columns=['Cel.', 'Fah.', 'Wind'],
                  index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

result = df.rename(columns={"Cel.": "Celsius", "Fah.": "Fahrenheit"})
print(result)
  • Line [1] creates a DataFrame with Celsius, Fahrenheit, and Wind for four (4) days using date_range() and saves it to df.
  • Line [2] renames the columns to those set out in the columns parameter and saves it to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

            Celsius  Fahrenheit     Wind
2014-02-12       24         115  extreme
2014-02-13       31          87     high
2014-02-14       22          65   medium
2014-02-15        3           9      low
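
The mapper can also be a function, and the errors parameter controls what happens when a dictionary key has no matching label. The sketch below shows both on the forecast DataFrame. The 'Kelvin' label is deliberately absent from the data, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[24, 115, 'extreme'],
                   [31, 87,  'high'],
                   [22, 65,  'medium'],
                   [3,  9,   'low']],
                  columns=['Cel.', 'Fah.', 'Wind'],
                  index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

# A function is applied to every column label.
upper = df.rename(columns=str.upper)
print(list(upper.columns))   # ['CEL.', 'FAH.', 'WIND']

# With the default errors='ignore', an unknown label is silently skipped.
ignored = df.rename(columns={'Kelvin': 'K'})

# With errors='raise', the same call raises a KeyError.
raised = False
try:
    df.rename(columns={'Kelvin': 'K'}, errors='raise')
except KeyError:
    raised = True
print(raised)                # True
```

Using errors='raise' is a handy safeguard against typos in label names that would otherwise go unnoticed.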

DataFrame rename_axis()

The rename_axis() method sets the name of the index or columns axis. Unlike rename(), which changes the labels themselves, it leaves the labels untouched.

The syntax for this method is as follows:

DataFrame.rename_axis(mapper=None, index=None, columns=None, axis=None, copy=True, inplace=False)
Parameter   Description
mapper      The value to set as the axis name.
index       A scalar, list, dictionary, or function used to set the name(s) of the index.
columns     A scalar, list, dictionary, or function used to set the name(s) of the columns. The columns parameter is ignored if the object is a Series.
axis        The axis to name: zero (0) or index for the rows, one (1) or columns for the columns.
copy        If set to True, a copy is created. This parameter is True by default.
inplace     If set to True, the changes apply to the original DataFrame. If False, the changes apply to a new DataFrame. By default, False.

For this example, the same 4-day forecast DataFrame used above is modified.

df = pd.DataFrame([[24, 115, 'extreme'],
                   [31, 87,  'high'],
                   [22, 65,  'medium'],
                   [3,  9,   'low']],
                  columns=['Cel.', 'Fah.', 'Wind'],
                  index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

result = df.rename_axis("Dates")
print(result)
  • Line [1] creates a DataFrame with Celsius, Fahrenheit, and Wind for four (4) days using date_range() and saves it to df.
  • Line [2] sets the name of the index to Dates and saves it to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

            Cel.  Fah.     Wind
Dates
2014-02-12    24   115  extreme
2014-02-13    31    87     high
2014-02-14    22    65   medium
2014-02-15     3     9      low
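
The difference between rename_axis() and rename() is easiest to see side by side. In the sketch below, rename_axis() names both axes while the column labels stay as they were, and rename() changes a label while the axis name stays unset. The axis name 'Stats' and the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame([[24, 115, 'extreme'],
                   [31, 87,  'high'],
                   [22, 65,  'medium'],
                   [3,  9,   'low']],
                  columns=['Cel.', 'Fah.', 'Wind'],
                  index=pd.date_range(start='2014-02-12', end='2014-02-15', freq='D'))

# rename_axis() names the axes; the labels themselves do not change.
named = df.rename_axis(index='Dates', columns='Stats')
print(named.index.name, named.columns.name)   # Dates Stats
print(list(named.columns))                    # ['Cel.', 'Fah.', 'Wind']

# rename() changes a label; the axis name stays unset.
relabeled = df.rename(columns={'Cel.': 'Celsius'})
print(list(relabeled.columns))                # ['Celsius', 'Fah.', 'Wind']
print(relabeled.columns.name)                 # None
```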