Generate a single PDF from a website's HTML pages

Here's the problem: there's a website that I often need for reference, and I would like an offline version that also works on mobile devices. A PDF comes to mind.

I can make an offline copy of the HTML version with wget; that's not the problem.

What I'd really like is a way to transform all the HTML pages into a single PDF with the internal links still working. That is, a link that pointed to another URL of the site in the web version should point to the corresponding page in the PDF.

Ideally, there should also be a way to generate a table of contents to put into the PDF.

How can I achieve this?

Bash/Python/Ruby/whatever scripts and other command-line tools are welcome, too.

(I'm on OS X 10.9, by the way.)


Take a look at wkhtmltopdf, a free command-line tool. Judging from the list of advanced features advertised in its manual page, it should cover at least most of what you need:

Printing more than one HTML document into a PDF file.

Running without an X11 server.

Adding a document outline to the PDF file.

Adding headers and footers to the PDF file.

Generating a table of contents.

Adding links in the generated PDF file.

Printing using the screen media-type.

Disabling the smart shrink feature of webkit.
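
As a rough sketch of how the features above could be combined, the Python script below collects every .html file from a local wget mirror and builds a single wkhtmltopdf call that puts a table of contents in front of all pages. It assumes wkhtmltopdf is installed and on your PATH; the function names build_command and convert, and the directory name in the usage note, are made up for illustration. Whether links between pages end up pointing at the right place inside the PDF depends on how the mirrored links are written, so treat that part as something to verify on your own site.

```python
import subprocess
from pathlib import Path


def build_command(html_dir, output_pdf, with_toc=True):
    """Build the wkhtmltopdf argument list for all .html files under html_dir.

    build_command is a hypothetical helper, not part of wkhtmltopdf.
    Pages are added in sorted path order; adjust this if your site
    needs a different reading order.
    """
    pages = sorted(str(p) for p in Path(html_dir).rglob("*.html"))
    cmd = ["wkhtmltopdf"]
    if with_toc:
        # "toc" asks wkhtmltopdf to insert a generated table of contents.
        cmd.append("toc")
    cmd.extend(pages)              # every input page, in order
    cmd.append(str(output_pdf))    # the output file always comes last
    return cmd


def convert(html_dir, output_pdf):
    """Run wkhtmltopdf; raises CalledProcessError if the tool fails."""
    subprocess.run(build_command(html_dir, output_pdf), check=True)
```

For example, after mirroring the site into a directory named mirror (a placeholder name), calling convert("mirror", "site.pdf") would run wkhtmltopdf once over all mirrored pages and write site.pdf.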