W3C log analyzer [closed]

Solution 1:

I've used AWStats in a number of installations. It's nice... and free.

http://awstats.sourceforge.net

Here's a link (on their site) to a decent comparison with some other popular products:

http://awstats.sourceforge.net/docs/awstats_compare.html

Solution 2:

I use Webalizer and AWStats.

Both will analyse the W3C Extended Log Format, and both are licensed under the GNU General Public License (GPL).

I have also used Analog in the past, but its main advantage for me was raw speed, and my log files are no longer big enough to make a third analyser worthwhile.

There are differences between the three in the format and extensiveness of the displayed results, and if you are producing statistics and graphs for many sites, some users may prefer one to the other. Many web hosting companies provide a couple of these for online use by customers.

The platform you are using can be an issue. All three (and more!) are available in GNU/Linux distribution repositories, e.g. Ubuntu 9.04, and are usually ready to run when installed. It may take more work to get the one you want working on other platforms (e.g. I think AWStats needs Perl; Webalizer comes from the author as C source code or as Linux-x86 or Solaris executables).

To choose between the many analysers available, decide which reports you need, which you would like, and which you don't want, and compare that list to what the various tools offer. Tailoring the output to what you want makes it easier to get at the information you need.

You may want to consider several runs through the data to produce different reports, possibly using different tools. For example, a fast run that does nothing but identify unexpected errors (missing files or graphics (404) referenced by other pages on the site, and unexpected error codes) can be helpful to the site administrator; a sketch of such a pass follows below. Data providers are less interested in those reports but may want to know which pages are most popular, which search strings were used, and how many visitors came. Network administrators may want to know the average and peak total load, and which pages generate the most load, so that they can check whether those pages have been optimised properly.

Eventually people start to ask questions that none of the tools answers well, but experience with several different analysers may postpone that day for a while.
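As a concrete illustration of that fast error pass: the following is a minimal Python sketch that reads a W3C Extended Log Format file, uses the #Fields directive to name the columns, and counts 404 responses per URL. The field names sc-status and cs-uri-stem are standard in the extended format, but whether your server actually logs them depends on its configuration, and the sample filename is my own placeholder.

    #!/usr/bin/env python3
    # Count 404 responses per URL in a W3C Extended Log Format file.
    # Lines look like (depending on the #Fields directive in the file):
    #   #Fields: date time c-ip cs-method cs-uri-stem sc-status
    #   2009-01-01 12:00:01 10.0.0.1 GET /images/logo.gif 404
    from collections import Counter

    def count_404s(path):
        fields = []
        misses = Counter()
        with open(path) as log:
            for line in log:
                if line.startswith("#Fields:"):
                    # The directive names the columns of the data lines that follow.
                    fields = line.split()[1:]
                elif line.startswith("#") or not line.strip():
                    continue  # other directives (#Version, #Date, ...) and blank lines
                else:
                    row = dict(zip(fields, line.split()))
                    if row.get("sc-status") == "404":
                        misses[row.get("cs-uri-stem", "?")] += 1
        return misses

    if __name__ == "__main__":
        for uri, n in count_404s("ex090101.log").most_common(20):
            print("%6d  %s" % (n, uri))

AWStats and Webalizer report all of this and much more, of course; the point is only that a special-purpose pass over the raw log can be very quick when you only need one answer.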

Not analysing the logs from the server, but relevant to the area: Google provides Webmaster Tools, which give information about the site from Google's perspective, gained from site crawls. As well as showing the site's ranking within Google on certain search terms and which other sites link to yours, they provide other information such as which pages Google is not indexing (e.g. because of robots.txt restrictions) and which pages it cannot find. These are a useful adjunct to log file analysis on the server when looking for errors and missing material on the site.
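On the robots.txt point: if you want to check from your own side whether a page would be blocked for a given crawler, Python's standard library can evaluate the rules for you. A minimal sketch, assuming placeholder site and page URLs:

    #!/usr/bin/env python3
    # Check whether robots.txt blocks a crawler from particular pages.
    # The site and page URLs below are placeholders.
    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")
    rp.read()  # fetch and parse the live robots.txt

    for path in ("/index.html", "/private/report.html"):
        url = "http://www.example.com" + path
        verdict = "allowed" if rp.can_fetch("Googlebot", url) else "BLOCKED"
        print(verdict, url)

Cross-checking that against the "pages not indexed" report from Webmaster Tools can quickly tell you whether a missing page is a crawl restriction or a genuine error on the site.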