5 Best Ways to Add a Row to a Pandas DataFrame

💡 Problem Formulation:

When working with tabular data in Python, data analysts and scientists often need to add new data to an existing Pandas DataFrame. Suppose you have a DataFrame representing daily sales data, and you want to add a new row for today’s sales. The input is your existing DataFrame, and the desired output is the DataFrame with the new row appended at the bottom.

Method 1: Using DataFrame.append()

DataFrame.append() returns a new DataFrame with the given row (a dict or Series) or another DataFrame added at the end; the original DataFrame remains unchanged. Note that this method was deprecated in pandas 1.4 and removed in pandas 2.0, so the example below runs only on older versions. On current pandas, use pd.concat() (Method 3) instead.

Here’s an example:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
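new_row = {'Date': '2023-03-03', 'Sales': 260}
# Requires pandas < 2.0: DataFrame.append() was removed in 2.0.
# Appending a dict only works with ignore_index=True.
df_appended = df.append(new_row, ignore_index=True)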

print(df_appended)

The output of the code snippet will be:

         Date  Sales
0  2023-03-01    200
1  2023-03-02    240
2  2023-03-03    260

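The new row appears in df_appended, while df itself is unchanged. ignore_index=True renumbers the result 0, 1, 2 and is required when the row is a dict: without it, pandas raises a TypeError, because a dict carries no index label for the new row. (A named Series can be appended without it, and its name becomes the row label.)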
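For code that has to run on pandas 2.0 or later, where DataFrame.append() no longer exists, the same non-destructive behavior can be sketched with pd.concat(). The helper name append_row is hypothetical and not part of pandas:

```python
import pandas as pd

def append_row(df, row):
    """Return a new DataFrame with `row` (a dict) added at the bottom.

    Hypothetical helper, not a pandas API; `df` is left unchanged.
    """
    row_df = pd.DataFrame([row])  # one-row DataFrame built from the dict
    return pd.concat([df, row_df], ignore_index=True)

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
df_appended = append_row(df, {'Date': '2023-03-03', 'Sales': 260})

print(df_appended)
```

As with append(), the result has to be assigned to a name; df still holds only the original two rows.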

Method 2: Using DataFrame.loc[]

The DataFrame.loc[] indexer can add a row by assigning to an index label that does not exist yet, which pandas calls setting with enlargement. This modifies the original DataFrame in place, so there is no second DataFrame to rebind or keep around.

Here’s an example:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
# Assigning to a label that does not exist yet adds a new row
df.loc[len(df)] = ['2023-03-03', 260]

print(df)

The output of the code snippet will be:

         Date  Sales
0  2023-03-01    200
1  2023-03-02    240
2  2023-03-03    260

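This adds the new row at the end of the DataFrame by using len(df) as the new index label. That label is free only when the index is the default 0, 1, ..., n-1; if rows were dropped or the index is custom, len(df) may match an existing label, and the assignment then overwrites that row instead of adding one.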
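The choice of label matters once the index is no longer 0, 1, ..., n-1. The sketch below shows one way this can go wrong and one possible workaround, assuming an integer index; the variable names (gapped, overwritten, extended) are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02', '2023-03-03'],
    'Sales': [200, 240, 260]
})
gapped = df.drop(index=0)  # index labels are now [1, 2]

# len(gapped) is 2, and label 2 already exists, so this overwrites a row
overwritten = gapped.copy()
overwritten.loc[len(overwritten)] = ['2023-03-04', 300]

# Using max label + 1 picks a label that is not taken yet
extended = gapped.copy()
extended.loc[extended.index.max() + 1] = ['2023-03-04', 300]

print(overwritten)
print(extended)
```

Here overwritten still has two rows, with the row labeled 2 replaced, while extended gains a third row labeled 3.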

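Method 3: Using pd.concat()

The pd.concat() function concatenates pandas objects along a particular axis. It is a top-level pandas function, not a DataFrame method, and it returns a new object. To add a row, concatenate your DataFrame with a one-row DataFrame that represents the new row. A Series has to be converted to a one-row DataFrame first (for example with series.to_frame().T); otherwise it is attached as a column.

Here’s an example: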

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
new_row_df = pd.DataFrame([['2023-03-03', 260]], columns=['Date', 'Sales'])
df_concated = pd.concat([df, new_row_df], ignore_index=True)

print(df_concated)

The output of the code snippet will be:

         Date  Sales
0  2023-03-01    200
1  2023-03-02    240
2  2023-03-03    260

This method is useful when adding multiple rows or when the new data is already a DataFrame. ignore_index=True renumbers the result from 0; without it, the new row would keep its own label 0 and the index would contain a duplicate.
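When several rows arrive at once, it is usually simpler to collect them first and call pd.concat() a single time. A minimal sketch, assuming the new rows are available as a list of dicts (the second row’s values are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})

# Illustrative data: each dict is one new row keyed by column name
new_rows = [
    {'Date': '2023-03-03', 'Sales': 260},
    {'Date': '2023-03-04', 'Sales': 275},
]

# Build one DataFrame from all new rows, then concatenate once
df_all = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)

print(df_all)
```

Concatenating once avoids building a fresh DataFrame for every single row, which is what happens when concat is called inside a loop.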

Method 4: Using a dict and DataFrame.iloc[]

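The DataFrame.iloc[] indexer selects by integer position, but unlike loc[] it cannot enlarge a DataFrame: assigning to df.iloc[len(df)] raises an IndexError. It is still useful for position-based insertion. Slice the DataFrame around the target position with iloc[], turn the dict into a one-row DataFrame, and concatenate the pieces.

Here’s an example:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
new_row = {'Date': '2023-03-03', 'Sales': 260}
pos = len(df)  # insert position; len(df) places the row at the bottom

# iloc cannot enlarge a DataFrame, so slice around pos and concatenate
df = pd.concat(
    [df.iloc[:pos], pd.DataFrame([new_row]), df.iloc[pos:]],
    ignore_index=True
)

print(df)

The output of the code snippet will be:

         Date  Sales
0  2023-03-01    200
1  2023-03-02    240
2  2023-03-03    260

The dict becomes a one-row DataFrame, and the two iloc[] slices determine where it lands. With pos = len(df) the second slice is empty and the row goes to the bottom; a smaller pos would insert it between existing rows.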
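Because iloc[] only addresses positions that already exist, position-based insertion is usually done by slicing around the target position and concatenating. A minimal sketch that places a row between two existing rows; insert_row is a hypothetical helper, not a pandas function:

```python
import pandas as pd

def insert_row(df, pos, row):
    """Return a new DataFrame with `row` (a dict) inserted at position `pos`.

    Hypothetical helper, not a pandas API; `df` is left unchanged.
    """
    pieces = [df.iloc[:pos], pd.DataFrame([row]), df.iloc[pos:]]
    return pd.concat(pieces, ignore_index=True)

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-03'],
    'Sales': [200, 260]
})
# Put the missing day between the two existing rows
df_inserted = insert_row(df, 1, {'Date': '2023-03-02', 'Sales': 240})

print(df_inserted)
```

The design choice here is to return a new DataFrame rather than mutate the input, which mirrors how pd.concat() itself behaves.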

Bonus One-Liner Method 5: Using List Append

A quick way to add a row is to assign a plain list to df.loc[len(df)]. This is the same enlargement mechanism as Method 2, written as a one-liner: the list carries no column names, so its values must follow the DataFrame’s column order exactly.

Here’s an example:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})
# Values are matched to columns by position: Date first, then Sales
df.loc[len(df)] = ['2023-03-03', 260]

print(df)

The output of the code snippet will be the same as in previous methods:

         Date  Sales
0  2023-03-01    200
1  2023-03-02    240
2  2023-03-03    260

Because the list is matched to columns purely by position, a value placed in the wrong slot is stored silently in the wrong column. The one-liner is compact, but less explicit than passing a dict or a one-row DataFrame with named columns.
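If the new values arrive keyed by column name, one way to keep the one-liner while avoiding ordering mistakes is to build the list from df.columns. This is a sketch under the assumption that the dict has a key for every column; the dict name row is illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-03-01', '2023-03-02'],
    'Sales': [200, 240]
})

# Keys are deliberately in a different order than df.columns
row = {'Sales': 260, 'Date': '2023-03-03'}

# Look up each value by column name so the list follows the column order
df.loc[len(df)] = [row[col] for col in df.columns]

print(df)
```

A missing key raises a KeyError here, which is arguably preferable to a value landing in the wrong column unnoticed.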

Summary/Discussion

  • Method 1: DataFrame.append(). Non-destructive; accepts a dict, Series, or DataFrame. Deprecated in pandas 1.4 and removed in 2.0, so it is only an option on older versions.
  • Method 2: DataFrame.loc[]. In-place. Quick and straightforward, but len(df) is a safe label only on a default integer index.
  • Method 3: pd.concat(). Non-destructive and flexible for multiple rows. Ideal for combining DataFrames, but more involved for a single row.
  • Method 4: DataFrame.iloc[] with a dict. iloc[] cannot add rows on its own; combined with pd.concat() it inserts a row at any position.
  • Method 5: List assignment via loc[]. A single line of code, but requires precise knowledge of the column order.