Concatenate rows of two dataframes in pandas
I need to concatenate two dataframes df_a
anddf_b
having equal number of rows (nRow
) one after another without any consideration of keys. This function is similar to cbind
in R programming language
. The number of columns in each dataframe may be different.
The resultant dataframe will have the same number of rows nRow
and number of columns equal to the sum of number of columns in both the dataframes. In othe words, this is a blind columnar concatenation of two dataframes.
import pandas as pd
dict_data = {'Treatment': ['C', 'C', 'C'], 'Biorep': ['A', 'A', 'A'], 'Techrep': [1, 1, 1], 'AAseq': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'mz':[500.0, 500.5, 501.0]}
df_a = pd.DataFrame(dict_data)
dict_data = {'Treatment1': ['C', 'C', 'C'], 'Biorep1': ['A', 'A', 'A'], 'Techrep1': [1, 1, 1], 'AAseq1': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'inte1':[1100.0, 1050.0, 1010.0]}
df_b = pd.DataFrame(dict_data)
call concat
and pass param axis=1
to concatenate column-wise:
In [5]:
pd.concat([df_a,df_b], axis=1)
Out[5]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
There is a useful guide to the various methods of merging, joining and concatenating online.
For example, as you have no clashing columns you can merge
and use the indices as they have the same number of rows:
In [6]:
df_a.merge(df_b, left_index=True, right_index=True)
Out[6]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
And for the same reasons as above a simple join
works too:
In [7]:
df_a.join(df_b)
Out[7]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
Thanks to @EdChum I was struggling with same problem especially when indexes do not match. Unfortunatly in pandas guide this case is missed (when you for example delete some rows)
import pandas as pd
t=pd.DataFrame()
t['a']=[1,2,3,4]
t=t.loc[t['a']>1] #now index starts from 1
u=pd.DataFrame()
u['b']=[1,2,3] #index starts from 0
#option 1
#keep index of t
u.index = t.index
#option 2
#index of t starts from 0
t.reset_index(drop=True, inplace=True)
#now concat will keep number of rows
r=pd.concat([t,u], axis=1)