How to Set and Reset Pandas DataFrame Indexes

The set_index() and reset_index() methods of a Pandas DataFrame change which values serve as its row index.

The set_index() method builds the index from one or more existing columns.
The reset_index() method restores the default integer index of the DataFrame.

Pandas set_index example

Let us create a Pandas DataFrame to show basic usage of the set_index() method.

Assume that a survey of programmers is conducted to observe some patterns. The survey asks the following questions:

What are their names?
What is their job category: freelancer or full-time jobholder?
What is the programming language of their choice at work?
How many years of experience do they have?
Which country do they belong to?

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({
   ...:     "name": ['Chris', 'Priyatham', 'Alice', 'Bob'],
   ...:     "category": ['freelancer', 'freelancer', 'fulltime_job', 'fulltime_job'],
   ...:     "prog_lang": ['Python', 'C', 'Python', 'C'],
   ...:     "exp": [5, 2, 15, 15],
   ...:     "country": ['Germany', 'India', 'France', 'USA']
   ...: })

In [3]: df
Out[3]: 
        name      category prog_lang  exp  country
0      Chris    freelancer    Python    5  Germany
1  Priyatham    freelancer         C    2    India
2      Alice  fulltime_job    Python   15   France
3        Bob  fulltime_job         C   15      USA

According to the Pandas documentation, set_index() is a DataFrame method. Its four main parameters are:

keys
drop
append
inplace

To make the name column of the above DataFrame the index, pass the column name as the keys parameter of set_index():

In [4]: indexed_df = df.set_index('name')

In [5]: indexed_df
Out[5]: 
               category prog_lang  exp  country
name                                           
Chris        freelancer    Python    5  Germany
Priyatham    freelancer         C    2    India
Alice      fulltime_job    Python   15   France
Bob        fulltime_job         C   15      USA
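One practical effect of using name as the index is that rows can be looked up by label with .loc. The sketch below rebuilds the same survey DataFrame so that it runs on its own, and also checks that the original df still has name as a regular column:

```python
import pandas as pd

# Same survey data as in the example above.
df = pd.DataFrame({
    "name": ['Chris', 'Priyatham', 'Alice', 'Bob'],
    "category": ['freelancer', 'freelancer', 'fulltime_job', 'fulltime_job'],
    "prog_lang": ['Python', 'C', 'Python', 'C'],
    "exp": [5, 2, 15, 15],
    "country": ['Germany', 'India', 'France', 'USA']
})

indexed_df = df.set_index('name')

# Label-based lookup: fetch Alice's row by her name.
alice = indexed_df.loc['Alice']
print(alice['prog_lang'])            # Python
print(alice['exp'])                  # 15

# set_index() returned a new object; df itself still has 'name' as a column.
print('name' in df.columns)          # True
print('name' in indexed_df.columns)  # False
```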

Pandas set_index inplace

Notice that set_index() returned a new DataFrame in the example above and left df unchanged. To set the index on the existing DataFrame instead, use the inplace parameter. It is a boolean that defaults to False; pass inplace=True to modify the DataFrame itself.

The following code shows this on a copy of df:

In [6]: indexed_df_inplace = df.copy()

In [7]: indexed_df_inplace
Out[7]: 
        name      category prog_lang  exp  country
0      Chris    freelancer    Python    5  Germany
1  Priyatham    freelancer         C    2    India
2      Alice  fulltime_job    Python   15   France
3        Bob  fulltime_job         C   15      USA

In [8]: indexed_df_inplace.set_index('name', inplace=True)

In [9]: indexed_df_inplace
Out[9]: 
               category prog_lang  exp  country
name                                           
Chris        freelancer    Python    5  Germany
Priyatham    freelancer         C    2    India
Alice      fulltime_job    Python   15   France
Bob        fulltime_job         C   15      USA

In the output above, the index of indexed_df_inplace is no longer the default RangeIndex. It is now a regular Index, named name, built from the values of the name column.
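A quick way to confirm what inplace=True does is to inspect the return value and the index type before and after the call. The following is a minimal sketch on a reduced version of the survey data (only the name and exp columns are kept here for brevity):

```python
import pandas as pd

# Reduced version of the survey data, kept short for illustration.
df = pd.DataFrame({
    "name": ['Chris', 'Priyatham', 'Alice', 'Bob'],
    "exp": [5, 2, 15, 15],
})

index_type_before = type(df.index).__name__
print(index_type_before)          # RangeIndex

# With inplace=True the method modifies df itself and returns None.
result = df.set_index('name', inplace=True)
print(result)                     # None

index_type_after = type(df.index).__name__
print(index_type_after)           # Index
print(df.index.name)              # name
```

Because the call returns None, assigning its result back to a variable (for example df = df.set_index('name', inplace=True)) would lose the DataFrame.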

Whenever the index is set with set_index(), the chosen column is removed from the regular columns and becomes the index. This happens because the drop parameter defaults to True. To keep the column as well, pass drop=False.

The following code shows this:

In [10]: ind_df_inplace_intact = df.copy()

In [11]: ind_df_inplace_intact.set_index('name', inplace=True, drop=False)

In [12]: ind_df_inplace_intact
Out[12]: 
                name      category prog_lang  exp  country
name                                                      
Chris          Chris    freelancer    Python    5  Germany
Priyatham  Priyatham    freelancer         C    2    India
Alice          Alice  fulltime_job    Python   15   France
Bob              Bob  fulltime_job         C   15      USA

In the result above, the ind_df_inplace_intact DataFrame has name both as a regular column and as the index.
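The append parameter listed earlier keeps the existing index and adds the new key as an additional index level, instead of replacing the index. A minimal sketch, assuming a reduced version of the survey data (the variable names by_name and by_name_and_category are only for illustration):

```python
import pandas as pd

# Reduced version of the survey data, kept short for illustration.
df = pd.DataFrame({
    "name": ['Chris', 'Priyatham', 'Alice', 'Bob'],
    "category": ['freelancer', 'freelancer', 'fulltime_job', 'fulltime_job'],
    "exp": [5, 2, 15, 15],
})

by_name = df.set_index('name')

# append=True keeps 'name' in the index and adds 'category' as a second level.
by_name_and_category = by_name.set_index('category', append=True)

level_names = list(by_name_and_category.index.names)
print(level_names)                          # ['name', 'category']
print(by_name_and_category.index.nlevels)   # 2

# A row is now identified by a (name, category) pair.
alice_exp = by_name_and_category.loc[('Alice', 'fulltime_job'), 'exp']
print(alice_exp)                            # 15
```

Without append=True, the second set_index() call would replace name with category as the only index level.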

Pandas reset_index()

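The reset_index() method replaces the index of a DataFrame with the default integer index, which runs from 0 to the number of rows minus 1. By default, the old index is moved back into the DataFrame as a column; pass drop=True to discard it instead. The level parameter accepts an int, a str, a tuple, or a list, and selects which index levels to remove from the index. By default, all levels are removed.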
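The behaviour described above can be sketched as follows. The example uses a reduced version of the survey data and covers the default call, drop=True, and level on a two-level index; the variable names are only for illustration:

```python
import pandas as pd

# Reduced version of the survey data, kept short for illustration.
df = pd.DataFrame({
    "name": ['Chris', 'Priyatham', 'Alice', 'Bob'],
    "category": ['freelancer', 'freelancer', 'fulltime_job', 'fulltime_job'],
    "exp": [5, 2, 15, 15],
})
indexed_df = df.set_index('name')

# Default: 'name' moves back into the columns and a fresh 0..n-1 index is created.
restored = indexed_df.reset_index()
print(list(restored.columns))   # ['name', 'category', 'exp']
print(list(restored.index))     # [0, 1, 2, 3]

# drop=True discards the old index instead of turning it into a column.
dropped = indexed_df.reset_index(drop=True)
print(list(dropped.columns))    # ['category', 'exp']

# level selects which level(s) to reset when the index has several.
multi = df.set_index(['name', 'category'])
partial = multi.reset_index(level='category')
print(partial.index.name)       # name
print(list(partial.columns))    # ['category', 'exp']
```

Like set_index(), reset_index() returns a new DataFrame by default, so indexed_df and multi are left unchanged here.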