Software needed to analyze Apache logs [closed]
I have grepped the Apache logs of my website for a specific period of time. I need software to analyze the result and show which IP addresses visit most often.
My OS is Ubuntu.
Would you suggest something?
Thank you.
A number of log file analyzer packages exist; two of them are Webalizer (written in C) and AWStats (written in Perl). Both should be available through the Ubuntu repositories. They parse your log files and generate reports for you to view (typically through a web browser, although they can generate text reports as well).
Traditionally, they are set up to run automatically (e.g. via cron), and the latest report is available at a particular path under your domain and/or is emailed to you. As you might expect, both require some setup to produce what you want.
If you just want to get a list of which IP addresses have made the most requests, a one-off shell command is likely easier than setting up either of the above. Try:
awk '{a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' access.log | sort -n -r
Essentially, this builds an array indexed by IP address (the first field of each line), increments the counter for each matching line, and prints the entries with more than 10 requests. The result is piped through sort to display it in descending order.
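The same counting logic can be spelled out as a small shell function, which may be easier to read and adapt. This is only a sketch: the name top_ips and its threshold argument are made up for illustration, and it assumes the client IP is the first whitespace-separated field of each line, as in Apache's common and combined log formats.

```shell
# Hypothetical helper (not part of any package).
# Usage: top_ips MIN [FILE...]
# Prints "count ip" for every IP with more than MIN requests, busiest first.
# With no FILE, awk reads standard input, so it can end a pipeline.
top_ips() {
    min="$1"
    shift
    awk -v min="$min" '
        { count[$1]++ }                 # one hit per line, keyed by field 1 (the IP)
        END {
            for (ip in count)
                if (count[ip] > min)    # keep only IPs above the threshold
                    print count[ip], ip
        }
    ' "$@" | sort -n -r                 # numeric sort, largest count first
}
```

For example, `top_ips 10 access.log` should behave like the one-liner above, and lowering the threshold to 0 lists every address that appears in the log.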
If you have already used grep to filter your list, you can pipe the output of that command through (a slight variation of) the one above.
For instance, for a list of IP addresses that made more than 10 requests today between midnight and 4am (exclusive), presuming that data is in access.log:
grep '02/Apr/2012:0[0-3]' access.log | awk '{a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' | sort -n -r
Sample output:
40 66.xxx.xx.xx
35 184.xx.xxx.xx
26 147.xx.xxx.xx
18 74.xxx.xx.xxx
17 209.xx.xxx.xxx
15 14.xxx.xxx.xx
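If a fixed number of the busiest addresses is more useful than a threshold, a common alternative is the sort | uniq -c idiom. The following is a minimal sketch under the same assumption that the client IP is the first field of each line; the function name top_n_ips is hypothetical.

```shell
# Hypothetical helper (not part of any package).
# Usage: top_n_ips N [FILE...]
# Prints the N busiest client IPs, each prefixed by its request count.
# With no FILE, it reads standard input.
top_n_ips() {
    n="$1"
    shift
    awk '{ print $1 }' "$@" |   # keep only the first field (client IP)
        sort |                  # bring identical IPs together
        uniq -c |               # collapse each run into "count ip"
        sort -k1,1nr |          # order by count, highest first
        head -n "$n"            # keep the top N lines
}
```

For example, `top_n_ips 5 access.log` would list the five most frequent addresses. Note that uniq -c pads the count column with leading spaces.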