5 Best Ways to Insert a Row into a Pandas DataFrame

💡 Problem Formulation: When working with data in Python, a common task is to add new observations to an existing DataFrame. This operation can be necessary for data preprocessing or during data collection when new records arrive. Suppose we have a DataFrame representing a classroom of students and we want to add a new student’s information as a row. The input is a DataFrame with columns ‘Name’, ‘Age’, and ‘Grade’, and the desired output is the same DataFrame with the new student’s data included.

Method 1: Using `loc[]`

Using loc[] allows you to add a new row to a DataFrame by specifying the index label of the new row and assigning a list, Series, or dictionary of values. This method changes the DataFrame in place and is straightforward when you know the label the new row should receive. If the label already exists, the assignment overwrites that row instead of adding one.

Here’s an example:

import pandas as pd

# Existing DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Grade': ['A', 'B']
})

# New row data
new_student = {'Name': 'Charlie', 'Age': 27, 'Grade': 'A+'}

# Insert new row at index 2
df.loc[2] = new_student
print(df)

The output of this code snippet will be:

      Name  Age Grade
0    Alice   25     A
1      Bob   30     B
2  Charlie   27    A+

This code first creates a DataFrame with two existing students. Then, it adds a new student ‘Charlie’ to the DataFrame by assigning the values to the next available index using df.loc[2]. The new row is inserted at the bottom of the DataFrame.
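Because loc[] modifies the DataFrame in place, one option when the original must be preserved is to work on a copy. The following is a minimal sketch under that assumption; the variable name extended is illustrative and not part of the original example.

```python
import pandas as pd

# The original two-student DataFrame from the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Grade': ['A', 'B']
})

# Copy first, so the in-place assignment below does not touch df
extended = df.copy()
extended.loc[len(extended)] = ['Charlie', 27, 'A+']

print(len(df), len(extended))  # 2 3
```

The copy costs extra memory, so this is only worth doing when both versions are actually needed.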

Method 2: Using `append()`

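The append() method adds a single row or multiple rows to a DataFrame. You can pass a Series, DataFrame, or dictionary to it. It returns a new DataFrame with the rows appended and never modifies the original; there is no inplace parameter, so the result must be assigned back. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so this method only works on older versions. On current versions, use pd.concat() instead (see Method 4).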

Here’s an example:

# Starting again from the original two-row DataFrame
# New row as a Series
new_student_series = pd.Series(['Charlie', 27, 'A+'], index=df.columns)

# Append the new row (pandas < 2.0 only; append() was removed in 2.0)
df = df.append(new_student_series, ignore_index=True)
print(df)

The output of this code snippet will be:

      Name  Age Grade
0    Alice   25     A
1      Bob   30     B
2  Charlie   27    A+

In this example, the new student’s data is created as a Series whose index matches the DataFrame’s columns. The append() method then adds it as a row. Because the Series has no name, ignore_index=True is required here; it also renumbers the resulting index from 0.
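On pandas 2.0 and later, where DataFrame.append() no longer exists, the same "return a new DataFrame" behaviour can be approximated with pd.concat(). The sketch below assumes the row arrives as a dictionary; append_row is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def append_row(df, row):
    """Return a new DataFrame with `row` (a dict) added at the end.

    Hypothetical helper: mimics the old append() by leaving `df` untouched.
    """
    row_df = pd.DataFrame([row], columns=df.columns)
    return pd.concat([df, row_df], ignore_index=True)


df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Grade': ['A', 'B']
})

result = append_row(df, {'Name': 'Charlie', 'Age': 27, 'Grade': 'A+'})
print(result)
```

As with append(), the original df still has two rows afterwards; only result contains the new student.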

Method 3: Using `DataFrame.loc[]` with a Non-Existent Index

A quick way to add a row in a DataFrame is by using loc[] with an index that doesn’t yet exist. If you assign a new row to a non-existent index, pandas will automatically create that index and the corresponding row.

Here’s an example:

# Insert new row with non-existent index
df.loc[len(df)] = ['Diana', 28, 'B+']
print(df)

The output of this code snippet will be:

      Name  Age Grade
0    Alice   25     A
1      Bob   30     B
2  Charlie   27    A+
3    Diana   28    B+

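This code snippet adds a new row under a label equal to the current length of the DataFrame. With the default integer index (0 to n-1), that label does not exist yet, so the row is appended at the end. If the index is not the default one, for example after filtering or dropping rows, len(df) may already be in use, and the assignment then overwrites that row.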
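The len(df) trick depends on the index being the default 0 to n-1 range. The sketch below illustrates what can happen after filtering, and one possible way around it (resetting the index first); the variable names filtered, risky, and safe are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 27],
    'Grade': ['A', 'B', 'A+']
})

# After filtering, the remaining rows keep their old labels: 1 and 2
filtered = df[df['Age'] > 25].copy()

# len(risky) is 2, and label 2 already belongs to Charlie,
# so this assignment overwrites Charlie instead of adding a row
risky = filtered.copy()
risky.loc[len(risky)] = ['Diana', 28, 'B+']

# Resetting the index first restores the 0..n-1 labels,
# so len(safe) is a label that does not exist yet
safe = filtered.reset_index(drop=True)
safe.loc[len(safe)] = ['Diana', 28, 'B+']

print(risky['Name'].tolist())  # ['Bob', 'Diana']
print(safe['Name'].tolist())   # ['Bob', 'Charlie', 'Diana']
```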

Method 4: Using `concat()`

The concat() function in pandas is primarily used for concatenating two or more DataFrames along rows or columns. You can use this function to add a new row by creating a new DataFrame with just the row data and concatenating it with the original DataFrame.

Here’s an example:

# New row as a DataFrame
new_row_df = pd.DataFrame([['Eve', 29, 'A']], columns=df.columns)

# Concatenate the new row DataFrame with the existing DataFrame
df = pd.concat([df, new_row_df], ignore_index=True)
print(df)

The output of this code snippet will be:

      Name  Age Grade
0    Alice   25     A
1      Bob   30     B
2  Charlie   27    A+
3    Diana   28    B+
4      Eve   29     A

This example demonstrates how to add a new row by creating a one-row DataFrame and then using pd.concat() to append it to the original DataFrame. The parameter ignore_index=True renumbers the result from 0 instead of keeping each DataFrame’s original labels.
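Since concat() accepts whole DataFrames, several new rows can be added in a single call rather than one at a time. The following is a minimal sketch, assuming the new records arrive as a list of dictionaries; the name incoming is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Grade': ['A', 'B']
})

# Assumed input format: one dictionary per new student
incoming = [
    {'Name': 'Eve', 'Age': 29, 'Grade': 'A'},
    {'Name': 'Frank', 'Age': 26, 'Grade': 'C'},
]

# Build one DataFrame from all new rows, then concatenate once
new_rows = pd.DataFrame(incoming, columns=df.columns)
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```

Collecting rows first and concatenating once avoids copying the whole DataFrame for every new row, which matters when many rows are added in a loop.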

Bonus Method 5: Using `DataFrame.iloc[]` with `concat()`

The iloc[] indexer selects rows by integer position. Unlike loc[], it cannot enlarge a DataFrame: an assignment such as df.iloc[len(df)] = [...] raises an IndexError. What iloc[] can do is slice the DataFrame at any position, so that a new row can be concatenated between the two pieces. This is less direct than the other methods, but it is the way to place a row somewhere other than the end.

Here’s an example:

# The row to insert, as a one-row DataFrame
new_row_df = pd.DataFrame([['Frank', 26, 'C']], columns=df.columns)

# Split at the target position (here the end) and join the pieces around the new row
pos = len(df)
df = pd.concat([df.iloc[:pos], new_row_df, df.iloc[pos:]], ignore_index=True)
print(df)

The output of this code snippet will be:

      Name  Age Grade
0    Alice   25     A
1      Bob   30     B
2  Charlie   27    A+
3    Diana   28    B+
4      Eve   29     A
5    Frank   26     C

This example splits the DataFrame at position pos with iloc[], then uses pd.concat() to join the first part, the new row, and the rest. With pos = len(df) the row lands at the end, as in Method 3; a smaller pos places it earlier in the DataFrame.
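To place a row at an arbitrary position rather than at the end, one option is to slice with iloc[] and concatenate the pieces around the new row. The sketch below also shows that assigning through iloc[] to a position past the end fails; insert_row is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def insert_row(df, position, row):
    """Return a new DataFrame with `row` (a list of values) placed at `position`.

    Hypothetical helper built on iloc slicing and pd.concat.
    """
    row_df = pd.DataFrame([row], columns=df.columns)
    return pd.concat(
        [df.iloc[:position], row_df, df.iloc[position:]],
        ignore_index=True,
    )


df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Grade': ['A', 'B']
})

# iloc cannot enlarge a DataFrame: this assignment raises IndexError
try:
    df.iloc[len(df)] = ['Frank', 26, 'C']
    enlarged = True
except IndexError:
    enlarged = False

# Insert between Alice and Bob instead of at the end
result = insert_row(df, 1, ['Zed', 31, 'B'])
print(result['Name'].tolist())  # ['Alice', 'Zed', 'Bob']
```

The helper returns a new DataFrame and renumbers the index, so any meaningful index labels on the original would need to be handled separately.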

Summary/Discussion

Method 1: Using loc[]. Very intuitive for adding rows by index. It does change the DataFrame in place, which could be a drawback when an original DataFrame needs to be preserved.
Method 2: Using append(). Appends a single row or multiple rows and returns a new DataFrame, leaving the original unmodified. It copies all the data on every call, which is inefficient for large DataFrames or inside loops, and it was removed in pandas 2.0, so it is only available on older versions.
Method 3: Using loc[] with a Non-Existent Index. A convenient shorthand for appending rows to the bottom of a DataFrame that has the default integer index. It modifies the DataFrame in place.
Method 4: Using concat(). Ideal for adding multiple rows and concatenating DataFrames vertically or horizontally. However, it can be more verbose for just adding a single row.
Bonus Method 5: Using iloc[]. iloc[] cannot add rows by assignment, but slicing with it and concatenating the pieces lets you place a row at any position. It is more verbose than loc[] when the row simply goes at the end.
