pandas three-way joining multiple dataframes on columns
Solution 1:
Assumed imports:
import pandas as pd
John Galt's answer is basically a reduce
operation. If I have more than a handful of dataframes, I'd put them in a list like this (generated via list comprehensions or loops or whatnot):
dfs = [df0, df1, df2, dfN]
Assuming they have some common column, like name
in your example, I'd do the following:
df_final = reduce(lambda left,right: pd.merge(left,right,on='name'), dfs)
That way, your code should work with whatever number of dataframes you want to merge.
Edit August 1, 2016: For those using Python 3: reduce
has been moved into functools
. So to use this function, you'll first need to import that module:
from functools import reduce
Solution 2:
You could try this if you have 3 dataframes
# Merge multiple dataframes
df1 = pd.DataFrame(np.array([
['a', 5, 9],
['b', 4, 61],
['c', 24, 9]]),
columns=['name', 'attr11', 'attr12'])
df2 = pd.DataFrame(np.array([
['a', 5, 19],
['b', 14, 16],
['c', 4, 9]]),
columns=['name', 'attr21', 'attr22'])
df3 = pd.DataFrame(np.array([
['a', 15, 49],
['b', 4, 36],
['c', 14, 9]]),
columns=['name', 'attr31', 'attr32'])
pd.merge(pd.merge(df1,df2,on='name'),df3,on='name')
alternatively, as mentioned by cwharland
df1.merge(df2,on='name').merge(df3,on='name')