Convert xlsx to csv in Linux with command line
I'm looking for a way to convert xlsx files to csv files on Linux.
I do not want to use PHP/Perl or anything like that since I'm looking at processing several millions of lines, so I need something quick. I found a program on the Ubuntu repos called xls2csv but it will only convert xls (Office 2003) files (which I'm currently using) but I need support for the newer Excel files.
Any ideas?
The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:
$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv
$ cat newfile.csv
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line
To install on Ubuntu:
apt-get install gnumeric
To install on Mac:
brew install gnumeric
You can do this with LibreOffice:
libreoffice --headless --convert-to csv $filename --outdir $outdir
For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to you sudoers file:
users ALL=(ALL) NOPASSWD: libreoffice
If you already have a Desktop environment then I'm sure Gnumeric / LibreOffice would work well, but on a headless server (such as Amazon Web Services), they require dozens of dependencies that you also need to install.
I found this Python alternative:
https://github.com/dilshod/xlsx2csv
$ easy_install xlsx2csv
$ xlsx2csv file.xlsx > newfile.csv
Took 2 seconds to install and works like a charm.
If you have multiple sheets you can export all at once, or one at a time:
$ xlsx2csv file.xlsx --all > all.csv
$ xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
$ xlsx2csv file.xlsx -s 1 > sheet1.csv
He also links to several alternatives built in Bash, Python, Ruby, and Java.