Convert .xls/.xlsx spreadsheets to multiple .csv files based on a list
I need to convert every sheet of an .xls/.xlsx file to its own .csv file. This should be done for all .xls/.xlsx files in all directories and sub-directories (recursively).
Step 1: Get the sheetnames of all .xls into a .csv using:
for file in $(find . -name '*.xls' -o -name '*.xlsx'); do in2csv -n "$file" > "${file%.*}-sheetnames-list.csv"; done
filename-sheetnames-list.csv can act as a list:
sheetname1
sheetname2
sheetname3
Step 2: The code for converting a specific sheet into a .csv using in2csv is:
in2csv --sheet "SHEETNAME" filename.xls > filename-SHEETNAME.csv
How can I get every sheet name in an .xls/.xlsx file and write each sheet to a separate .csv, for all directories containing an .xls/.xlsx file? I tried:
in2csv --write-sheets "-" filename.xls > filename-sheet1.csv filename-sheet2.csv ....
This only produces output in sheet1.csv, and I am not sure how to get all sheets from it.
Solution 1:
You can just put one loop inside another.
To avoid errors, don't use for with find results: the shell splits the output on whitespace, so file names containing spaces break.
while IFS= read -r file; do
    while IFS= read -r sheet; do
        in2csv --sheet "$sheet" "$file" > "${file%.*}-${sheet}.csv"
    done < <(in2csv -n "$file")
done < <(find . -name '*.xls' -o -name '*.xlsx')
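To illustrate why the for loop is fragile, here is a minimal sketch that counts loop iterations for a single file whose name contains a space. The file name "my report.xls" and the temporary directory are made up for the demo, and in2csv is not called at all.

```bash
#!/usr/bin/env bash
# Sketch: compare a for loop over $(find ...) with a while-read loop.
# "my report.xls" is a hypothetical file name used only for this demo.
tmp=$(mktemp -d)
touch "$tmp/my report.xls"

for_count=0
for file in $(find "$tmp" -name '*.xls'); do
    for_count=$((for_count + 1))     # runs once per word, not once per file
done

read_count=0
while IFS= read -r file; do
    read_count=$((read_count + 1))   # runs once per line, i.e. once per file
done < <(find "$tmp" -name '*.xls')

echo "for: $for_count iterations, while read: $read_count iteration(s)"
rm -rf "$tmp"
```

The for loop sees the path as two separate words, while the while-read loop sees it as one line. File names containing newlines would still break the line-based loop; find's -print0 together with read -d '' is the usual way around that.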
Solution 2:
Skipping find and using bash:
shopt -s globstar # enable recursive globbing
for f in **/*.xls{,x} # for files ending in .xls or .xlsx
do
    in2csv -n "$f" | # get the sheetnames
        xargs -I {} bash -c 'in2csv --sheet "$2" "$1" > "${1%.*}"-"$2".csv' _ "$f" {} # {} will be replaced with the sheetname
done
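As a rough sketch of what the glob and the "${1%.*}" expansion do, the following builds the output names without calling in2csv. The directory layout, the file names, and the sheet name "Sheet 1" are hypothetical, and nullglob is an addition that is not part of the answer above; it keeps the loop from running on the literal pattern when nothing matches.

```bash
#!/usr/bin/env bash
# Sketch: how the recursive glob and the suffix removal behave.
# Directory and file names below are hypothetical.
shopt -s globstar nullglob   # nullglob: an unmatched pattern expands to nothing
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
touch "$tmp/a.xls" "$tmp/sub dir/b.xlsx"

cd "$tmp" || exit 1
outputs=()
for f in **/*.xls{,x}; do
    sheet="Sheet 1"                     # stand-in for a name printed by in2csv -n
    outputs+=("${f%.*}-${sheet}.csv")   # strip the last extension, append the sheet name
done
printf '%s\n' "${outputs[@]}"
cd - >/dev/null
rm -rf "$tmp"
```

Because the glob results are never word-split, the space in "sub dir" survives intact. Note that globstar needs bash 4 or newer.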
Solution 3:
csvkit versions newer than 1.0.2 have a built-in option to write all sheets:
--write-sheets: WRITE_SHEETS
The names of the Excel sheets to write to files, or
"-" to write all sheets.
So you could try the following:
find . \( -name '*.xls' -o -name '*.xlsx' \) -exec in2csv --write-sheets "-" {} \;
Note:
This does not seem to work 100% as expected, but it is worth a try. As this is the first version with that option, the implementation may become better or easier to use in future versions.
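Independently of csvkit, grouping matters whenever -o is combined with -exec in find: the implicit "and" binds tighter than -o, so without \( ... \) the -exec is attached only to the last -name test. A minimal sketch, with echo standing in for in2csv and made-up file names:

```bash
#!/usr/bin/env bash
# Sketch: operator precedence in find, with echo standing in for in2csv.
# File names are hypothetical.
tmp=$(mktemp -d)
touch "$tmp/a.xls" "$tmp/b.xlsx"

# Without grouping, -exec is attached only to the second -name test,
# so a.xls matches but no action runs for it.
ungrouped=$(find "$tmp" -name '*.xls' -o -name '*.xlsx' -exec echo {} \; | wc -l | tr -d ' ')

# With \( ... \), -exec applies to files matching either pattern.
grouped=$(find "$tmp" \( -name '*.xls' -o -name '*.xlsx' \) -exec echo {} \; | wc -l | tr -d ' ')

echo "ungrouped: $ungrouped file(s), grouped: $grouped file(s)"
rm -rf "$tmp"
```

This only demonstrates which files reach the -exec action; it says nothing about how --write-sheets itself behaves.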