Split log file by date
Solution 1:
A Perl solution, taking advantage of GNU date to convert the dates:
perl -ne 'if(/^###<(.*)>/){
chomp($d=`date -d \"$1\" +%Y_%m_%d`);
$name="$d.log"
}
open(my $fh,">>","$name");
print $fh $_;' file.log
Explanation
- -ne : read the input file line by line (saving each line as the special variable $_) and apply the script given by -e to each line.
- if(/^###<(.*)>/) : if the line starts with ###<, capture everything between the <> as $1 (that's what the parentheses do).
- chomp($d=`date -d \"$1\" +%Y_%m_%d`); : the date command reformats the date. For example:
  $ date -d "Sep 1, 2016 1:00:01 AM" +%Y_%m_%d
  2016_09_01
  The chomp removes the final newline from the result of date so we can use it later.
- $name="$d.log" : we save the result of the date command plus .log as the variable $name.
- open(my $fh,">>","$name"); : open the file $name for appending as the file handle $fh. Don't worry if you don't know what a file handle is, this just means that print $fh "foo" will print foo into $name.
- print $fh $_; : print the current line into the file that the file handle $fh points to. So, print the line into whatever is currently saved as $name.
Solution 2:
One approach for solving this could be to use awk. For example, this command:
awk -F'[ <,]+' '/^###/{close(f);f=$4"_"$2"_"$3".log"}{print >> f}END{close(f)}' file
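should split the file into one file per date header, building each filename from the header's year, month and day fields. Assuming headers like ###<Sep 1, 2016 1:00:01 AM>, that gives names such as 2016_Sep_1.log.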
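If numeric, zero-padded names such as 2016_09_01.log are wanted without calling date, the month abbreviation can be mapped to a number inside awk. This is only a sketch of that variation, not the command above: the lookup string, the variable m, the sample input and the guard for lines before the first header are my own additions, and English three-letter month names are assumed.

```shell
# Hypothetical sample input, written to a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' '###<Sep 1, 2016 1:00:01 AM>' 'first entry' \
              '###<Sep 2, 2016 3:15:00 PM>' 'second entry' > file

awk -F'[ <,]+' '
/^###/ {
    close(f)
    # The position of the abbreviation in the lookup string gives the month number.
    m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $2) + 2) / 3
    f = sprintf("%04d_%02d_%02d.log", $4, m, $3)
}
f { print >> f }       # lines before the first header are skipped
END { if (f) close(f) }' file

ls                     # file, 2016_09_01.log, 2016_09_02.log
```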
Solution 3:
With awk (GNU awk is needed, since gensub is a gawk extension):
awk '/^#+<[^>]+>$/ {if (lines) print lines >file; \
dt=gensub("^#+<([^>]+)>$", "\\1", 1); \
dt_cmd="date -d \""dt"\" +%Y_%m_%d.log"; \
dt_cmd | getline file; close(dt_cmd); lines=$0; next}; \
{lines=lines ORS $0} END {print lines >file}' file.log
Readable form:
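awk '
/^#+<[^>]+>$/ {
    # Write out the previous chunk before switching to a new file
    if (lines)
        print lines >file
    # Extract the date between <>; the third argument 1 replaces the first match in $0
    dt=gensub("^#+<([^>]+)>$", "\\1", 1)
    dt_cmd="date -d \""dt"\" +%Y_%m_%d.log"
    dt_cmd | getline file
    close(dt_cmd)
    lines=$0
    next
}
{
    lines=lines ORS $0
}
END {
    print lines >file
}' file.log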
/^#+<[^>]+>$/ matches the lines containing dates; the chunk surrounded by {} is run only if this condition matches. On a match, we first save the content of the variable lines collected so far to the file named by the variable file (this is the previous chunk). We then get the date in the desired format by using the external date command and save its output in the variable file, and finally we initialize the variable lines again with the current line.
For all other lines, we append the line to the variable lines.
The last chunk is saved by printing it in the END block.
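The date-conversion step can be looked at in isolation. The sketch below feeds one hypothetical header line to awk and prints the filename that date produces. It uses sub() instead of gensub() so that it also runs in awks other than gawk; the variable names cmd and file_name are placeholders, and GNU date is still required.

```shell
# Convert one (made-up) header line into the target filename.
file_name=$(echo '###<Sep 1, 2016 1:00:01 AM>' | awk '
/^#+<[^>]+>$/ {
    dt = $0
    sub(/^#+</, "", dt)       # strip the leading ###<
    sub(/>$/, "", dt)         # strip the trailing >
    cmd = "date -d \"" dt "\" +%Y_%m_%d.log"
    cmd | getline file        # read the single line that date prints
    close(cmd)                # release the pipe before the next header
    print file
}')
echo "$file_name"             # with GNU date this prints 2016_09_01.log
```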