Python Join List of DataFrames


To join a list of DataFrames, say dfs, use the pandas.concat(dfs) function, which concatenates an arbitrary number of DataFrames into a single one.

When browsing StackOverflow, I recently stumbled upon the following interesting problem. By thinking about solutions to those small data science problems, you can improve your data science skills, so let’s dive into the problem description.

Problem: Given a list of pandas DataFrames, how do you merge them into a single DataFrame?

Example: You have the following list of pandas DataFrames:

import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# List of DataFrames
dfs = [df1, df2, df3]

Say you want to obtain the following DataFrame:

       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000


Method 1: Pandas Concat

This is the easiest and most straightforward way to concatenate multiple DataFrames.

import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# list of dataframes
dfs = [df1, df2, df3]

df = pd.concat(dfs)

This generates the following output:

print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000
'''

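The resulting DataFrame contains all the original data from all three DataFrames. Note that the original row labels are kept, so the index repeats 0, 1, 2 for each input DataFrame.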
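If the repeated row labels are unwanted, pd.concat() accepts an ignore_index parameter. The following is a minimal sketch using the same example data; the variable name df_reset is only chosen for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})

dfs = [df1, df2, df3]

# ignore_index=True discards the original row labels and
# numbers the rows of the result consecutively from 0.
df_reset = pd.concat(dfs, ignore_index=True)

print(df_reset.index.tolist())
```

All nine rows are still present; only the row labels change.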

Method 2: Reduce + DataFrame Merge

The following method uses the reduce function to repeatedly merge all DataFrames in the list, no matter how many there are. To merge two DataFrames, it uses the df.merge() method. You can use several merging strategies; the example uses "outer":

import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# list of dataframes
dfs = [df1, df2, df3]

# Method 2: fold the list pairwise with an outer merge
from functools import reduce
df = reduce(lambda left, right: left.merge(right, how="outer"), dfs)

This generates the following output:

print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
3         19       25
4      25000    11000
5         20       26
6      26000    10000
'''
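The merged result has seven rows rather than nine because merge() joins on all shared columns by default, so the identical ('scientist', 'student') row that appears in every DataFrame is matched and kept only once. As a rough cross-check, and not as a replacement for merge() in general, the same seven distinct rows can be obtained by concatenating and then dropping duplicates. The name deduped is only used for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})

dfs = [df1, df2, df3]

# Stack all nine rows, keep the first occurrence of each distinct row,
# and renumber the remaining rows from 0.
deduped = (
    pd.concat(dfs, ignore_index=True)
    .drop_duplicates()
    .reset_index(drop=True)
)

print(len(deduped))  # 7
```

This equivalence holds here only because every row is unique within its own DataFrame and all columns are shared; with other data, merge() and concat() can behave quite differently.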

The different merge strategies are described in the pandas documentation for DataFrame.merge(). If you used the parameter "inner" instead, you would obtain the following result:

       Alice      Bob
0  scientist  student
