Appending pandas dataframes generated in a for loop
I am accessing a series of Excel files in a for loop. I then read the data in the excel file to a pandas dataframe. I cant figure out how to append these dataframes together to then save the dataframe (now containing the data from all the files) as a new Excel file.
Here's what I tried:
for infile in glob.glob("*.xlsx"):
data = pandas.read_excel(infile)
appended_data = pandas.DataFrame.append(data) # requires at least two arguments
appended_data.to_excel("appended.xlsx")
Thanks!
Solution 1:
Use pd.concat
to merge a list of DataFrame into a single big DataFrame.
appended_data = []
for infile in glob.glob("*.xlsx"):
data = pandas.read_excel(infile)
# store DataFrame in list
appended_data.append(data)
# see pd.concat documentation for more info
appended_data = pd.concat(appended_data)
# write DataFrame to an excel sheet
appended_data.to_excel('appended.xlsx')
Solution 2:
you can try this.
data_you_need=pd.DataFrame()
for infile in glob.glob("*.xlsx"):
data = pandas.read_excel(infile)
data_you_need=data_you_need.append(data,ignore_index=True)
I hope it can help.