Get list of user-agents from nginx log
Solution 1:
awk -F'"' '/GET/ {print $6}' /var/log/nginx-access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn
awk(1) - selecting full User-Agent string of GET requests
cut(1) - using first word from it
sort(1) - sorting
uniq(1) - count
sort(1) - sorting by count, reversed
PS. Of course, it can be replaced by a single awk/sed/perl/python/etc. script. I just wanted to show how rich the Unix way is.
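As a rough sketch of the single-awk variant mentioned in the PS, the counting can be done inside awk itself, leaving only the final sort outside. The function name ua_count is made up for illustration. It assumes the same quote-delimited layout as the pipeline above (request in field 2, User-Agent in field 6), and it matches GET at the start of the request field rather than anywhere on the line, so results may differ slightly on unusual lines.

```shell
# ua_count is a hypothetical helper name, not part of the original answer.
# Assumes splitting on '"' puts the request in $2 and the User-Agent in $6.
ua_count() {
    awk -F'"' '
        $2 ~ /^GET / {            # only GET requests
            split($6, w, " ")     # w[1] is the first word of the User-Agent
            n[w[1]]++             # count occurrences per first word
        }
        END { for (ua in n) print n[ua], ua }
    ' "$@" | sort -rn             # highest count first
}

# Usage: ua_count /var/log/nginx-access.log
```

The output is "count name" per line, without the padding that uniq -c adds.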
Solution 2:
While the one-liner by SaveTheRbtz does the job, it took several hours to parse my nginx access log.
Here is a faster version based on his, which takes less than 1 minute per 100MB of log file (corresponding to about 1 million lines):
sed -n 's!.* "GET.* "\([[:alnum:].]\+/*[[:digit:].]*\)[^"]*"$!\1!p' /var/log/nginx/access.log | sort | uniq -c | sort -rfg
It works with the default access log format of nginx, which is the same as the combined format of Apache's httpd and has the User-Agent as the last field, delimited by ".
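To see what the expression keeps, it can be tried on a single line. The sketch below wraps the same sed command in a function (ua_first_token is a made-up name) and feeds it one invented log line in the combined format; the address, timestamp, and agent string are illustrative only. Note that \+ is a GNU extension to basic regular expressions, so this assumes GNU sed.

```shell
# ua_first_token is a hypothetical helper name; the sed expression is the one from above.
# Reads log lines on stdin and prints the leading name/version token of the
# User-Agent for GET requests. Assumes GNU sed because of \+.
ua_first_token() {
    sed -n 's!.* "GET.* "\([[:alnum:].]\+/*[[:digit:].]*\)[^"]*"$!\1!p'
}

# Invented sample line in the combined log format (not real data).
printf '%s\n' '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"' | ua_first_token
# prints: Mozilla/5.0
```

Lines that are not GET requests, or whose last quoted field does not end the line, produce no output because of -n and the p flag.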
Solution 3:
This is a slight variation of the accepted answer, using fgrep and cut.
cat your_file.log | fgrep '"GET ' | cut -d'"' -f6 | cut -d' ' -f1 | sort | uniq -c | sort -rn
There is something appealing about using "weaker" commands when it is possible.
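A possible variant of the same pipeline, sketched here as a function with a made-up name (ua_count_grep): it lets grep read the file directly instead of going through cat, and spells fgrep as grep -F, since recent GNU grep releases warn that fgrep is obsolescent. It takes a single log file as its argument and otherwise assumes the same log layout as the command above.

```shell
# ua_count_grep is a hypothetical helper name, not part of the original answer.
# $1 is the path to one access log file.
ua_count_grep() {
    grep -F '"GET ' "$1" |   # fixed-string match on the start of a GET request
        cut -d'"' -f6 |      # sixth quote-delimited field: the User-Agent
        cut -d' ' -f1 |      # first word of the User-Agent
        sort | uniq -c | sort -rn
}

# Usage: ua_count_grep your_file.log
```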