5 Best Ways to Concatenate Pandas Series in Python

💡 Problem Formulation: When working with data in Python, it's common to combine Series objects from the pandas library. You might have two or more Series objects with related data that you want to merge into a single Series for analysis or storage. For instance, if you have two Series containing monthly sales data from different quarters, you might want to concatenate them to get the sales data for the entire half-year. This article provides multiple solutions for concatenating Series objects efficiently.

Method 1: Using pd.concat()

One of the primary tools in pandas for concatenating Series is the pd.concat() function. It takes a list of Series and concatenates them along a particular axis (by default, axis=0, which denotes rows). This function is versatile and allows various customizations, such as resetting the index with ignore_index=True or building a hierarchical index with the keys parameter.

Here's an example:

import pandas as pd

series1 = pd.Series([1, 2, 3])
series2 = pd.Series([4, 5, 6])
concatenated_series = pd.concat([series1, series2])

print(concatenated_series)

Output:

0    1
1    2
2    3
0    4
1    5
2    6
dtype: int64

The pd.concat() function simply joins the two Series end-to-end, effectively creating a new Series with combined data from both sources. Notice that by default, the original indexes are preserved, which can result in duplicate index values.
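The two index options mentioned for pd.concat() can be sketched as follows. This is a minimal illustration, and the key labels "q1" and "q2" are assumptions chosen for the example rather than anything from the data above.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3])
series2 = pd.Series([4, 5, 6])

# ignore_index=True discards the original labels and renumbers from 0.
renumbered = pd.concat([series1, series2], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5]

# keys=... adds an outer index level that names the source of each value.
# The labels "q1" and "q2" are illustrative only.
labeled = pd.concat([series1, series2], keys=["q1", "q2"])
print(labeled.loc["q2"].tolist())  # [4, 5, 6]
print(labeled.index.nlevels)       # 2
```

Use ignore_index=True when the original labels carry no meaning, and keys when you need to trace each value back to the Series it came from.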

Method 2: Using append() Method

The append() method of a pandas Series appends another Series (or a list of Series) to the end of the original and returns a new Series. It's a simple and intuitive way to concatenate Series, but unlike pd.concat(), it is a Series method rather than a general function that works across pandas objects. Note that Series.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so this method works only on older versions; on current versions, use pd.concat() instead.

Here's an example:

import pandas as pd

series1 = pd.Series([7, 8, 9])
series2 = pd.Series([10, 11, 12])
# Series.append() exists only in pandas < 2.0; on newer versions use pd.concat()
appended_series = series1.append(series2, ignore_index=True)

print(appended_series)

Output:

0     7
1     8
2     9
3    10
4    11
5    12
dtype: int64

In this snippet, the append() method is called on series1 with series2 as the argument. The ignore_index=True parameter creates a new default integer index for the resulting Series.
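Because Series.append() is not available in pandas 2.0 and later, the same result can be obtained there with pd.concat(). The sketch below also shows a common pattern for combining many pieces: collect them in a list and concatenate once. The pieces list is hypothetical data made up for the illustration.

```python
import pandas as pd

series1 = pd.Series([7, 8, 9])
series2 = pd.Series([10, 11, 12])

# Same result as series1.append(series2, ignore_index=True) on older pandas.
appended_series = pd.concat([series1, series2], ignore_index=True)
print(appended_series.tolist())  # [7, 8, 9, 10, 11, 12]

# When combining many pieces, collect them in a list and concatenate once
# instead of growing a Series step by step.
pieces = [pd.Series([i, i + 1]) for i in range(0, 6, 2)]  # hypothetical chunks
combined = pd.concat(pieces, ignore_index=True)
print(combined.tolist())  # [0, 1, 2, 3, 4, 5]
```

A single pd.concat() call copies the data once, whereas appending repeatedly creates a new Series at every step.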

Method 3: Using the pd.Series() Constructor

A less direct but still effective way to concatenate Series is to convert each one to a list, join the lists, and pass the result to the pd.Series() constructor. This creates a brand-new Series with a default integer index. The original index labels are discarded, though you can supply your own through the index argument if desired.

Here's an example:

import pandas as pd

series1 = pd.Series([13, 14, 15])
series2 = pd.Series([16, 17, 18])
new_series = pd.Series(list(series1) + list(series2))

print(new_series)

Output:

0    13
1    14
2    15
3    16
4    17
5    18
dtype: int64

By using list concatenation, we created a combined list which is then used to instantiate a new Series. This method is straightforward but might not be as efficient as concat() or append() when working with large datasets, as it requires an additional conversion to and from a list.
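One consequence of going through plain lists is that index labels are lost unless you pass them back in. The following sketch assumes string labels "a" through "f", which are made up for the illustration, to make the effect visible.

```python
import pandas as pd

# The string labels here are illustrative, not from the example above.
series1 = pd.Series([13, 14, 15], index=["a", "b", "c"])
series2 = pd.Series([16, 17, 18], index=["d", "e", "f"])

# list() keeps only the values, so the labels a-f are dropped here.
values_only = pd.Series(list(series1) + list(series2))
print(values_only.index.tolist())  # [0, 1, 2, 3, 4, 5]

# To keep the labels, pass them explicitly through the index argument.
with_labels = pd.Series(
    list(series1) + list(series2),
    index=list(series1.index) + list(series2.index),
)
print(with_labels["e"])  # 17
```

If preserving labels matters, pd.concat() does this by default and is usually the simpler choice.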

Method 4: Using Series.add() with the fill_value=0 Parameter

If you're working with numeric data and want to combine Series by adding values together based on the index, you can use Series.add(). Strictly speaking, this is element-wise addition rather than concatenation: values that share an index label are summed instead of being placed end to end. This is particularly useful for time series and other numeric computations. When the Series don't have matching indexes, you can set the fill_value parameter to choose the value substituted for missing entries.

Here's an example:

import pandas as pd

# The indexes overlap only at labels 1 and 2; add() aligns values by label
series1 = pd.Series([0, 1, 2], index=[0, 1, 2])
series2 = pd.Series([3, 4, 5], index=[1, 2, 3])
added_series = series1.add(series2, fill_value=0)

print(added_series)

Output:

0    0.0
1    4.0
2    6.0
3    5.0
dtype: float64

The add() method computes the addition of two Series, aligning them by index labels. The resulting Series contains the sum of values for common indexes and the values from the non-matching indexes. The fill_value=0 argument ensures that missing values are treated as 0s during addition.
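To see what fill_value changes, it helps to compare the call with and without it. This is a small sketch that reuses the same two Series.

```python
import pandas as pd

series1 = pd.Series([0, 1, 2], index=[0, 1, 2])
series2 = pd.Series([3, 4, 5], index=[1, 2, 3])

# Without fill_value, labels present in only one Series produce NaN.
plain_sum = series1.add(series2)
print(plain_sum.isna().tolist())  # [True, False, False, True]

# With fill_value=0, the missing side is treated as 0 before adding.
filled_sum = series1.add(series2, fill_value=0)
print(filled_sum.tolist())  # [0.0, 4.0, 6.0, 5.0]

# The result has 4 entries (the union of the labels), not 6 as with concatenation.
print(len(filled_sum))  # 4
```

The result is float64 in both cases because the alignment step introduces missing values before the addition takes place.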

Bonus One-Liner Method 5: Using List Comprehension to Concatenate Series

For those well-versed in Python's list comprehensions, concatenating two Series can be done with a simple one-liner. This is a more Pythonic approach but is less transparent for those unfamiliar with list comprehensions and may be less efficient for larger datasets.

Here's an example:

import pandas as pd

series1 = pd.Series([19, 20, 21])
series2 = pd.Series([22, 23, 24])
concat_series = pd.Series([item for s in [series1, series2] for item in s])

print(concat_series)

Output:

0    19
1    20
2    21
3    22
4    23
5    24
dtype: int64

This one-liner combines the two Series by first flattening them into a single list using a nested list comprehension, then creating a new Series from this list.
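If the nested comprehension is hard to read, the standard library's itertools.chain.from_iterable performs the same flattening. This is an alternative technique, not the one used above, shown here only as a possible substitute.

```python
from itertools import chain

import pandas as pd

series1 = pd.Series([19, 20, 21])
series2 = pd.Series([22, 23, 24])

# chain.from_iterable yields the values of each Series in turn,
# which is what the nested comprehension does by hand.
chained = pd.Series(list(chain.from_iterable([series1, series2])))
print(chained.tolist())  # [19, 20, 21, 22, 23, 24]
```

As with the comprehension, only the values survive; the original index labels are replaced by a default integer index.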

Summary/Discussion

  • Method 1: pd.concat(): General-purpose concatenation with flexibility in index handling. Efficient for large datasets. May result in duplicate index values if not managed.
  • Method 2: append(): User-friendly and intuitive. Specifically designed for Series, but lacks the multifunctionality of pd.concat(). Deprecated in pandas 1.4 and removed in pandas 2.0, so it is only usable on older versions.
  • Method 3: pd.Series() Constructor: Creates a new Series with complete control over the index. Requires additional list conversion, potentially less efficient with large data.
  • Method 4: Series.add(): Ideal for numerical computations requiring index alignment and addition. Sums values that share a label rather than placing them end to end. The resultant index is a union of the input indexes.
  • Bonus Method 5: List Comprehension: Pythonic, quick for small datasets, but less clear and possibly less efficient than built-in methods for large datasets.
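The efficiency remarks above can be checked on your own machine with a rough timing sketch like the one below. The data size and the number of runs are arbitrary assumptions, and the timings will vary by machine and pandas version, so no expected numbers are shown.

```python
import timeit

import pandas as pd

# Hypothetical test data; the size is arbitrary.
s1 = pd.Series(range(10_000))
s2 = pd.Series(range(10_000, 20_000))

def with_concat():
    return pd.concat([s1, s2], ignore_index=True)

def with_constructor():
    return pd.Series(list(s1) + list(s2))

def with_comprehension():
    return pd.Series([item for s in [s1, s2] for item in s])

# All three approaches should produce the same values.
assert with_concat().tolist() == with_constructor().tolist()
assert with_concat().tolist() == with_comprehension().tolist()

# Time each approach; results depend on hardware and library versions.
for func in (with_concat, with_constructor, with_comprehension):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```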