Print chosen worksheets in excel files to pdf in python
I need to write a python script to read excel files, find each worksheet and then print these to pdf with the standard formating defined in the excel.
I found the following question How can I open an Excel file in Python? which pointed me to http://www.python-excel.org/
This gives me the ability to find the names of each worksheet.
import xlrd
book = xlrd.open_workbook("myfile.xls")
print "Worksheet name(s):", book.sheet_names()
This results in
Worksheet name(s): [u'Form 5', u'Form 3', u'988172 Adams Road', u'379562 Adams Road', u'32380 Adams Road', u'676422 Alderman Road', u'819631 Appleyard Road', u'280998 Appleyard Road', u'781656 Atkinson Road', u'949461 Barretts Lagoon Road', u'735284 Bilyana Road', u'674784 Bilyana Road', u'490894 Blackman Road', u'721026 Blackman Road']
Now I want to print each worksheet which starts with a number to a pdf.
So I can
worksheetList=book.sheet_names()
for worksheet in worksheetList:
if worksheet.find('Form')!=0: #this just leaves out worksheets with the word 'form' in it
<function to print to pdf> book.sheet_by_name(worksheet) #what can I use for this?
or something similar to above...what can I use to achieve this?
The XLRD documentation is confusing it says
Formatting features not included in xlrd version 0.6.1: Miscellaneous sheet-level and book-level items e.g. printing layout, screen panes
and
Formatting
Introduction
This collection of features, new in xlrd version 0.6.1, is intended to provide the information needed to (1) display/render spreadsheet contents (say) on a screen or in a PDF file
see https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html?p=4966
Which is true? can some other package be used to print to pdf?
For unix I see that there is http://dag.wieers.com/home-made/unoconv/ anything for windows? I found https://gist.github.com/mprihoda/2891437 but can't figure out how to use it yet.
This seems like the place to put this answer.
In the simplest form:
import win32com.client
o = win32com.client.Dispatch("Excel.Application")
o.Visible = False
wb_path = r'c:\user\desktop\sample.xls'
wb = o.Workbooks.Open(wb_path)
ws_index_list = [1,4,5] #say you want to print these sheets
path_to_pdf = r'C:\user\desktop\sample.pdf'
wb.WorkSheets(ws_index_list).Select()
wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
Including a little formatting magic that scales to fit to a single page and sets the print area:
import win32com.client
o = win32com.client.Dispatch("Excel.Application")
o.Visible = False
wb_path = r'c:\user\desktop\sample.xls'
wb = o.Workbooks.Open(wb_path)
ws_index_list = [1,4,5] #say you want to print these sheets
path_to_pdf = r'C:\user\desktop\sample.pdf'
print_area = 'A1:G50'
for index in ws_index_list:
#off-by-one so the user can start numbering the worksheets at 1
ws = wb.Worksheets[index - 1]
ws.PageSetup.Zoom = False
ws.PageSetup.FitToPagesTall = 1
ws.PageSetup.FitToPagesWide = 1
ws.PageSetup.PrintArea = print_area
wb.WorkSheets(ws_index_list).Select()
wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
I have also started a module over a github if you want to look at that: https://github.com/spottedzebra/excel/blob/master/excel_to_pdf.py