Shell script: Find entries in access log with 500 response within a specified date period
Can someone help me with a shell script that counts the entries with a 500 HTTP response in an access log within a specified time frame?
Solution 1:
You can use awk to filter entries within the specified time range:
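awk -v from='[02/Aug/2011:14:30:00' -v to='[02/Aug/2011:14:32:00' '$9 == "500" && $4 >= from && $4 <= to' /path/to/your/access_log | wc -l
The status code and timestamp may sit in different fields depending on your log format, so adjust $9 and $4 if needed. Also change from and to so they match the timestamp format in your log. Because awk splits on whitespace, $4 holds only "[02/Aug/2011:14:30:00" without the time zone, so the bounds leave the zone out; with the zone appended, an entry stamped exactly at from would compare as smaller and be dropped. This is a plain string comparison, so it is only reliable when both bounds fall within the same month.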
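String comparison on the raw timestamp breaks when the range crosses a month boundary, because "Aug" sorts before "Jul". A minimal sketch of one way around that in plain awk, not taken from the answer above: rebuild each timestamp as a yyyymmddHHMMSS key, which sorts chronologically. The function name count_500 and its argument order are hypothetical, and the sketch assumes the status is in field 9, the timestamp is in field 4, and every entry uses the same time zone.

```shell
# count_500 FROM TO LOGFILE
# FROM and TO look like dd/Mon/yyyy:HH:MM:SS (no bracket, no time zone).
# Prints the number of entries with status 500 inside [FROM, TO].
count_500() {
    awk -v from="$1" -v to="$2" '
    # Turn dd/Mon/yyyy:HH:MM:SS into yyyymmddHHMMSS.
    function key(t,    a) {
        split(t, a, "[/:]")
        return a[3] mon[a[2]] a[1] a[4] a[5] a[6]
    }
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mon[names[i]] = sprintf("%02d", i)
        lo = key(from)
        hi = key(to)
    }
    $9 == 500 {
        k = key(substr($4, 2))    # substr drops the leading "["
        if (k >= lo && k <= hi) count++
    }
    END { print count + 0 }       # "+ 0" prints 0 when nothing matched
    ' "$3"
}
```

It would be called as: count_500 30/Jul/2011:15:55:44 02/Aug/2011:01:00:00 access_log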
Solution 2:
You can convert the timestamps to epoch time and compare them numerically (this relies on the -d option of GNU date):
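#!/bin/bash
# Usage: ./log_filtering.sh FROM TO LOGFILE (timestamps as dd/Mon/yyyy:HH:MM:SS)
# Rewrite dd/Mon/yyyy:HH:MM:SS as "dd Mon yyyy HH:MM:SS" so that date -d can parse it.
from=$(date -d "$(echo "$1" | awk 'BEGIN { FS = "[/:]" } { print $1" "$2" "$3" "$4":"$5":"$6 }')" +%s)
to=$(date -d "$(echo "$2" | awk 'BEGIN { FS = "[/:]" } { print $1" "$2" "$3" "$4":"$5":"$6 }')" +%s)
while IFS= read -r line
do
    # Field 4 is "[dd/Mon/yyyy:HH:MM:SS"; substr drops the leading "[".
    date=$(printf '%s\n' "$line" | awk '{ print substr($4, 2) }' | awk 'BEGIN { FS = "[/:]" } { print $1" "$2" "$3" "$4":"$5":"$6 }')
    date=$(date -d "$date" +%s)
    [[ $date -ge $from && $date -le $to ]] && printf '%s\n' "$line"
done < "$3"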
Save it as log_filtering.sh and call it like this:
./log_filtering.sh 30/Jul/2011:15:55:44 02/Aug/2011:01:00:00 access_log
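The conversion step is easier to check in isolation. A minimal sketch, assuming GNU date (BSD and macOS date take different flags); the helper name to_epoch is hypothetical and uses sed instead of the awk one-liner from the script:

```shell
# to_epoch TIMESTAMP
# TIMESTAMP looks like dd/Mon/yyyy:HH:MM:SS; prints seconds since the Unix epoch.
# The result depends on the local time zone unless TZ is set.
to_epoch() {
    # "30/Jul/2011:15:55:44" becomes "30 Jul 2011 15:55:44":
    # every slash turns into a space, then only the first colon does.
    date -d "$(printf '%s\n' "$1" | sed 's|/| |g; s|:| |')" +%s
}
```

With TZ=UTC exported, to_epoch 30/Jul/2011:15:55:44 prints 1312041344. Note that the script above starts date and awk once per log line, so it can be slow on large logs.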
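The same filter can be written as a single awk script, which avoids starting a date process for every line. It needs gawk, because mktime is a gawk extension: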
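#!/bin/awk -f
# Requires gawk: mktime() is a gawk extension.
# Convert dd/Mon/yyyy:HH:MM:SS to epoch seconds; a, monthname, month and i are locals.
function toEpoch(t,    a, monthname, month, i) {
    split(t, a, "[/:]")
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", monthname, " ")
    for (i = 1; i <= 12; i++) month[monthname[i]] = i
    a[2] = month[a[2]]
    # mktime() expects "YYYY MM DD HH MM SS"
    return mktime(a[3] " " a[2] " " a[1] " " a[4] " " a[5] " " a[6])
}
BEGIN {
    start = toEpoch(starttime)
    end = toEpoch(endtime)
}
{ date = toEpoch(substr($4, 2)) }
# Print every line whose timestamp lies within [start, end].
(date >= start) && (date <= end)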
Save it as log_filtering.awk and pass the time range with -v:
gawk -f log_filtering.awk -v starttime=30/Jul/2011:04:12:24 -v endtime=02/Aug/2011:04:12:27 access_log
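Both scripts in this answer print every entry in the time range, whatever its status. To get the count of 500 responses that the question asks for, the output can be piped through a small counter. A minimal sketch, assuming the status code is in field 9; the function name count_status is hypothetical:

```shell
# count_status STATUS
# Reads access-log lines on stdin and prints how many have STATUS in field 9.
count_status() {
    awk -v status="$1" '$9 == status { n++ } END { print n + 0 }'
}
```

For example: gawk -f log_filtering.awk -v starttime=30/Jul/2011:04:12:24 -v endtime=02/Aug/2011:04:12:27 access_log | count_status 500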