Reading/parsing Excel (xls) files with Python [closed]
What is the best way to read Excel (XLS) files with Python (not CSV files).
Is there a built-in package which is supported by default in Python to do this task?
Solution 1:
I highly recommend xlrd for reading .xls
files. But there are some limitations(refer to xlrd github page):
Warning
This library will no longer read anything other than .xls files. For alternatives that read newer file formats, please see http://www.python-excel.org/.
The following are also not supported but will safely and reliably be ignored:
- Charts, Macros, Pictures, any other embedded object, including embedded worksheets. - VBA modules - Formulas, but results of formula calculations are extracted. - Comments - Hyperlinks - Autofilters, advanced filters, pivot tables, conditional formatting, data validation
Password-protected files are not supported and cannot be read by this library.
voyager mentioned the use of COM automation. Having done this myself a few years ago, be warned that doing this is a real PITA. The number of caveats is huge and the documentation is lacking and annoying. I ran into many weird bugs and gotchas, some of which took many hours to figure out.
UPDATE: For newer .xlsx
files, the recommended library for reading and writing appears to be openpyxl (thanks, Ikar Pohorský).
Solution 2:
Using pandas:
import pandas as pd
xls = pd.ExcelFile(r"yourfilename.xls") #use r before absolute file path
sheetX = xls.parse(2) #2 is the sheet number+1 thus if the file has only 1 sheet write 0 in paranthesis
var1 = sheetX['ColumnName']
print(var1[1]) #1 is the row number...
Solution 3:
You can choose any one of them http://www.python-excel.org/
I would recommended python xlrd library.
install it using
pip install xlrd
import using
import xlrd
to open a workbook
workbook = xlrd.open_workbook('your_file_name.xlsx')
open sheet by name
worksheet = workbook.sheet_by_name('Name of the Sheet')
open sheet by index
worksheet = workbook.sheet_by_index(0)
read cell value
worksheet.cell(0, 0).value