How to get text from range of dates using grep/sed in large text file?

With grep, if you know how many lines you want, you can use the context option -A to print lines after each match:

grep -A 3 2016-07-13 file

That gives you each line containing 2016-07-13 and the 3 lines after it.

With sed you can use the dates as range delimiters, like this:

sed -n '/2016-07-13/,/2016-07-19/p' file

This prints all lines from the first line containing 2016-07-13 up to and including the first line containing 2016-07-19. That is only enough if there is a single line with 2016-07-19, because any further lines for that date are not printed. If there are several, use the next date as the end of the range and use d to delete that closing line from the output:

sed -n '/2016-07-13/,/2016-07-20/{/2016-07-20/d; p}' file
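To see the difference between the two sed commands, here is a small sketch run against a made-up log. The file contents and the variable names short and full are assumptions for illustration only, and the second command is written as above, which assumes GNU sed.

```shell
# Hypothetical sample log, created in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
2016-07-12 before
2016-07-13 first
2016-07-18 middle
2016-07-19 last-a
2016-07-19 last-b
2016-07-20 after
EOF

# The range closes at the FIRST 2016-07-19 line, so "last-b" is dropped.
short=$(sed -n '/2016-07-13/,/2016-07-19/p' "$log")

# The range closes at the first 2016-07-20 line, which is then deleted.
full=$(sed -n '/2016-07-13/,/2016-07-20/{/2016-07-20/d; p}' "$log")

printf '%s\n' "$short"
echo ---
printf '%s\n' "$full"
rm -f "$log"
```

The first command should stop after last-a, while the second should also include last-b and leave out the 2016-07-20 line.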

This simple grep one-liner will be enough (the pattern is quoted so the shell does not try to expand the brackets):

grep -E '^2016-07-1[3-9]' filename

Works nicely here and there is no need for sed :)
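The bracket trick only varies one digit, so a range that crosses a tens boundary needs an alternation. As a sketch, assuming you wanted 2016-07-13 through 2016-07-25 (a range picked for illustration, with a made-up sample file):

```shell
# Hypothetical sample log, created in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
2016-07-12 a
2016-07-13 b
2016-07-19 c
2016-07-20 d
2016-07-25 e
2016-07-26 f
EOF

# 13-19 and 20-25 are written as two alternatives inside one group.
matched=$(grep -E '^2016-07-(1[3-9]|2[0-5])' "$log")

printf '%s\n' "$matched"
rm -f "$log"
```

Ranges that span months or years need more alternatives in the same style, and at some point the pattern gets hard to read.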

References:

  • Matching Numeric Ranges with a Regular Expression

awk solution:

$ awk '/^2016-07-13/,/^2016-07-19/' input.txt
2016-07-13 < ?xml version>
2016-07-18 < ?xml version>
2016-07-18 < ?xml version>
2016-07-19 < ?xml version>

This prints every line from the first one that starts with 2016-07-13 through the first one that starts with 2016-07-19. As with the sed range, printing stops at the first line matching the end pattern.


All the other current answers rely on the log file entries being sorted chronologically, or on the date range being easy to match with a regular expression. If you want a more generic solution, you need to do some more programming.

I present this GNU AWK script:

#!/usr/bin/gawk -f
BEGIN {
    starttime = mktime(starttime)
    endtime = mktime(endtime)
}

func in_range(n, start, end) {
    return start <= n && n < end
}

match($0, /^([0-9]{4})-([0-9]{2})-([0-9]{2})\s/, m) &&
    in_range(mktime(m[1] " " m[2] " " m[3] " 00 00 00"), starttime, endtime)

You supply the start and end time through the variables starttime and endtime, in a format that mktime understands (YYYY MM DD HH MM SS). Assuming the above Awk script is in an executable file filter-log-dates.awk in the current working directory and the log file is mylog.txt, you run it like so:

./filter-log-dates.awk -v starttime='2016 07 13 00 00 00' -v endtime='2016 07 20 00 00 00' mylog.txt

Note that the end time is exclusive, i.e., valid log records must have a time stamp before the end time.

If your time stamp format is different, you can adjust the regular expression passed to the match function to suit it.
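Conversely, if every record starts with a YYYY-MM-DD date as in the question, a lighter variant is possible. It is a different technique from the mktime script: dates in that form sort as plain strings in chronological order, so a string comparison is enough, and it should work in any POSIX awk. This is only a sketch; the variable names from and to and the sample data are made up. It keeps the same exclusive end and does not need the file to be sorted.

```shell
# Hypothetical sample log, deliberately out of order.
log=$(mktemp)
cat > "$log" <<'EOF'
2016-07-20 excluded, end is exclusive
2016-07-13 kept
no date here, skipped
2016-07-12 excluded
2016-07-19 kept too
EOF

in_range=$(awk -v from='2016-07-13' -v to='2016-07-20' '
    # Only consider lines that start with a YYYY-MM-DD date.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
        day = substr($0, 1, 10)
        # String comparison; from is inclusive, to is exclusive.
        if (day >= from && day < to) print
    }
' "$log")

printf '%s\n' "$in_range"
rm -f "$log"
```

Lines without a leading date (such as continuation lines of a multi-line record) are dropped by this sketch, so it only fits logs with one record per line.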