HTML minification? [closed]

Solution 1:

Perhaps try HTML Compressor, here's a before and after table showing what it can do (including for Stack Overflow itself):

Sorry, markdown has no concept of tables

It features many selections for optimizing your pages up to and including script minimizing (ompressor, Google Closure Compiler, your own compressor) where it would be safe. The default option set is quite conservative, so you can start with that and experiment with enabling more aggressive options.

The project is extremely well documented and supported.

Solution 2:

Don't do this. Or rather, if you insist on it, do it after any more significant site optimizations are complete. Chances are very high that the cost/benefit for this effort is negligible, especially if you were planning to manually use online tools to deal with each page.

Use YSlow or Page Speed to determine what you really need to do to optimize your pages. My guess is that reducing bytes of HTML will not be your site's biggest problem. It's much more likely that compression, cache management, image optimization, etc will make a bigger difference to the performance of your site overall. Those tools will show you what the biggest problems are -- if you've dealt with them all and still find that HTML minification makes a significant difference, go for it.

(If you're sure you want to go for it, and you use Apache httpd, you might consider using mod_pagespeed and turning on some of the options to reduce whitespace, etc., but be aware of the risks.)

Solution 3:

Here is a short answer to your question: you should minify your HTML, CSS, JS. There is an easy to use tool which is called grunt. It allows you to automate a lot of tasks. Among them JS, CSS, HTML minification, file concatenation and many others.

The answers written here are extremely outdated or even sometimes does not make sense. A lot of things changed from old 2009, so I will try to answer this properly.

Short answer - you should definitely minify HTML. It is trivial today and gives approximately 5% speedup. For longer answer read the whole answer

Back in old days people were manually minifying css/js (by running it through some specific tool to minify it). It was kind of hard to automate the process and definitely required some skills. Knowing that a lot of high level sites even right now are not using gzip (which is trivial), it is understandable that people were reluctant in minifying html.

So why were people minifying js, but not html? When you minify JS, you do the following things:

  • remove comments
  • remove blanks (tabs, spaces, newlines)
  • change long names to short (var isUserLoggedIn to var a)

Which gave a lot of improvement even at old days. But in html you were not able to change long names for short, also there was almost nothing to comment during that time. So the only thing that was left is to remove spaces and newlines. Which gives only small amount of improvement.

One wrong argument written here is that because content is served with gzip, minification does not make sense. This is totally wrong. Yes, it makes sense that gzip decrease the improvement of minification, but why should you gzip comments, whitespaces if you can properly trim them and gzip only important part. It is the same as if you have a folder to archive which has some crap that you will never use and you decide to just zip it instead of cleaning up and zip it.

Another argument why it pointless to do minification is that it is tedious. Maybe this was true in 2009, but new tools appeared after this time. Right now you do not need to manually minify your markup. With things like Grunt it is trivial to install grunt-contrib-htmlmin (relies on HTMLMinifier by @kangax) and to configure it to minify your html. All you need is like 2 hours to learn grunt and to configure everything and then everything is done automatically in less than a second. Sounds that 1 second (which you can even automate to do nothing with grunt-contrib-watch) is not really so bad for approximately 5% of improvement (even with gzip).

One more argument is that CSS and JS are static, and HTML is generated by the server so you can not pre-minify it. This was also true in 2009, but currently more and more sites are looking like a single page app, where the server is thin and the client is doing all the routing, templating and other logic. So the server is only giving you JSON and client renders it. Here you have a lot of html for the page and different templates.

So to finish my thoughts:

  • google is minifying html.
  • pageSpeed is asking your to minify html
  • it is trivial to do
  • it gives ~5% of improvement
  • it is not the same as gzip

Solution 4:

I wrote a web tool to minify HTML. http://prettydiff.com/?m=minify&html

This tool operates using these rules:

  • All HTML comments are removed
  • Runs of white space characters are converted to single space characters
  • Unnecessary white space characters inside tags are removed
  • White space characters between two tags where one of these two tags is not a singleton is removed
  • All content inside a style tag is presumed to be CSS and is minified as such
  • All content inside a script tag is presumed to be JavaScript, unless provided a different media type, and then minified as such
    • The CSS and JavaScript minification uses a heavily forked form of JSMin. This fork is extended to support CSS natively and also support SCSS syntax. Automatic semicolon insertion is supported for JavaScript minification, however automatic curly brace insertion is not yet supported.

    Solution 5:

    This worked for me:

    http://minify.googlecode.com/git/min/lib/Minify/HTML.php

    It's not an already available online tool, but being a simple PHP include it's easy enough you can just run it yourself.

    I would not save compressed files though, do this dynamically if you really have to, and it's always a better idea to enable Gzip server compression. I don't know how involved that is in IIS/.Net, but in PHP it's as trivial as adding one line to the global include file