xls to csv converter

Solution 1:

I would use xlrd: it's fast, cross-platform, and works directly with the file.

As of version 0.8.0, xlrd reads both XLS and XLSX files, but as of version 2.0.0 support was reduced back to XLS only.

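import xlrd
import csv

def csv_from_excel():
    # xlrd 2.0.0 and later opens .xls files only.
    wb = xlrd.open_workbook('your_workbook.xls')
    sh = wb.sheet_by_name('Sheet1')

    # Python 3: open in text mode; newline='' lets the csv module control line endings.
    with open('your_csv_file.csv', 'w', newline='', encoding='utf-8') as your_csv_file:
        wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)

        # Write every row of the sheet, one CSV line per row.
        for rownum in range(sh.nrows):
            wr.writerow(sh.row_values(rownum))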
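
Because xlrd 2.0.0 and later reads only XLS, a script that accepts arbitrary workbooks may want to check the file extension before opening anything. The following is a minimal sketch of that check, not part of xlrd itself: the `READERS` table and the `reader_for` helper are hypothetical names, and pairing `.xlsx` with openpyxl is an assumption based on the csvkit answer further down.

```python
from pathlib import Path

# Hypothetical lookup table: which library to reach for per extension.
READERS = {
    ".xls": "xlrd",       # the only format xlrd 2.0.0+ still reads
    ".xlsx": "openpyxl",  # assumption: openpyxl handles the newer format
}


def reader_for(path):
    """Return the name of the library suited to this workbook's extension."""
    suffix = Path(path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported workbook extension: {suffix!r}") from None
```

Calling `reader_for('your_workbook.xls')` gives `'xlrd'`, so the function above can be used as-is; for an `.xlsx` file the result signals that a different reader is needed.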

Solution 2:

I would use pandas. Its computationally heavy parts are written in Cython or as C extensions, which keeps it fast, and the syntax is very clean. For example, to turn "Sheet1" of the file "your_workbook.xls" into the file "your_csv.csv", use the top-level function read_excel and the DataFrame method to_csv as follows:

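import pandas as pd

# Reading .xls files requires xlrd to be installed alongside pandas.
data_xls = pd.read_excel('your_workbook.xls', sheet_name='Sheet1', index_col=None)

# index=False keeps pandas' row-number column out of the CSV.
data_xls.to_csv('your_csv.csv', encoding='utf-8', index=False)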

Setting encoding='utf-8' alleviates the UnicodeEncodeError mentioned in other answers.
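
If every sheet is wanted rather than just "Sheet1", read_excel can be asked for all of them: passing sheet_name=None returns a dict that maps each sheet name to a DataFrame. The sketch below builds on that; `write_frames` and `workbook_to_csvs` are hypothetical helper names, and it assumes the sheet names are usable as file names. pandas is imported inside the function so that the writing helper stays usable on its own.

```python
from pathlib import Path


def write_frames(frames, out_dir="."):
    """Write each {sheet_name: DataFrame} entry to <out_dir>/<sheet_name>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sheet_name, frame in frames.items():
        target = out_dir / f"{sheet_name}.csv"
        # index=False leaves out pandas' row-number column.
        frame.to_csv(target, index=False, encoding="utf-8")
        written.append(target)
    return written


def workbook_to_csvs(excel_path, out_dir="."):
    """Convert every sheet of a workbook into its own CSV file."""
    import pandas as pd  # needs pandas plus xlrd (.xls) or openpyxl (.xlsx)

    # sheet_name=None requests all sheets, returned as a dict of DataFrames.
    frames = pd.read_excel(excel_path, sheet_name=None)
    return write_frames(frames, out_dir)
```

A call such as `workbook_to_csvs('your_workbook.xls', 'csv_out')` would then leave one CSV per sheet in the `csv_out` directory.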

Solution 3:

Maybe someone will find this ready-to-use piece of code useful. It creates a CSV file from every worksheet in an Excel workbook.


Python 2:

# -*- coding: utf-8 -*-
import xlrd
import csv
import sys

def csv_from_excel(excel_file):
    workbook = xlrd.open_workbook(excel_file)
    all_worksheets = workbook.sheet_names()
    for worksheet_name in all_worksheets:
        worksheet = workbook.sheet_by_name(worksheet_name)
        with open(u'{}.csv'.format(worksheet_name), 'wb') as your_csv_file:
            wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
            for rownum in xrange(worksheet.nrows):
                wr.writerow([unicode(entry).encode("utf-8") for entry in worksheet.row_values(rownum)])

if __name__ == "__main__":
    csv_from_excel(sys.argv[1])

Python 3:

import xlrd
import csv
import sys

def csv_from_excel(excel_file):
    workbook = xlrd.open_workbook(excel_file)
    all_worksheets = workbook.sheet_names()
    for worksheet_name in all_worksheets:
        worksheet = workbook.sheet_by_name(worksheet_name)
        with open('{}.csv'.format(worksheet_name), 'w', newline='', encoding="utf-8") as your_csv_file:
            wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
            for rownum in range(worksheet.nrows):
                wr.writerow(worksheet.row_values(rownum))

if __name__ == "__main__":
    csv_from_excel(sys.argv[1])
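
One thing to watch with the Python 3 version: xlrd hands back numeric cells as floats, so a cell that shows 3 in Excel is written as "3.0". A possible cleanup, sketched here under the assumption that whole-number floats should be written as integers, is to tidy each value before it reaches the writer. `tidy_cell` and `write_rows` are hypothetical helper names, and a value that really was 2.0 would also come out as 2.

```python
import csv


def tidy_cell(value):
    """Turn whole-number floats such as 3.0 into ints; leave the rest alone."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def write_rows(rows, stream):
    """Write an iterable of rows to an open text stream, quoting every field."""
    writer = csv.writer(stream, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([tidy_cell(value) for value in row])
```

Inside csv_from_excel, the inner loop could then be replaced by `write_rows((worksheet.row_values(i) for i in range(worksheet.nrows)), your_csv_file)`.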

Solution 4:

I'd use csvkit, which uses xlrd (for xls) and openpyxl (for xlsx) to convert just about any tabular data to csv.

Once installed, with its dependencies, it's a matter of:

in2csv myfile > myoutput.csv

It takes care of all the format detection issues, so you can pass it just about any tabular data source. It's cross-platform too (no win32 dependency).
