A command-line HTML pretty-printer: Making messy HTML readable

I'm looking for recommendations for HTML pretty printers which fulfill the following requirements:

  • Takes HTML as input and outputs a nicely formatted, correctly indented, but "graphically equivalent" version of it.
  • Must support command-line operation.
  • Must be open-source and run under Linux.

Solution 1:

Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

There used to be a fork called tidy-html5, which has since become the official version. Its source is hosted on GitHub: https://github.com/htacg/tidy-html5

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:

tidy inputfile.html
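
Note that, run without options, Tidy writes the cleaned markup to standard output and does not indent it. A minimal sketch of a run that indents and writes to a file might look like the following; the file names and the sample markup are made up for illustration, and only the Tidy options (-i, -q, -w, -o) are real:

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

# A deliberately messy one-line input document.
printf '%s\n' '<html><head><title>Demo</title></head><body><div><p>Hello <b>world</b></p></div></body></html>' > "$workdir/messy.html"

if command -v tidy >/dev/null 2>&1; then
    # -i    indent element content
    # -q    quiet: suppress non-essential output
    # -w 0  disable line wrapping
    # -o    write the result to a file instead of standard output
    tidy -i -q -w 0 -o "$workdir/pretty.html" "$workdir/messy.html"
    status=$?
    # Tidy exits with 0 when the input was clean, 1 if there were
    # warnings, and 2 if there were errors, so a non-zero status
    # does not necessarily mean the run failed.
    echo "tidy exit status: $status"
    cat "$workdir/pretty.html"
else
    echo "tidy is not installed" >&2
fi
```

To rewrite a file in place instead, -m can replace the -o option. Since that overwrites the original, it is worth trying on a copy first.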

Solution 2:

Update 2018: The homebrew/dupes tap is now deprecated; tidy-html5 can be installed directly.

brew install tidy-html5

Original reply:

The Tidy that ships with OS X doesn't support HTML5, but there is an experimental branch on GitHub that does.

To get it:

 brew tap homebrew/dupes
 brew install tidy --HEAD
 brew untap homebrew/dupes

That's it! Have fun!