Splitting a file in linux based on content [duplicate]

Suppose you have a file mail.txt that contains several HTML messages:

$ cat mail.txt
<html>
    mail A
</html>

<html>
    mail B
</html>

<html>
    mail C
</html>

Run csplit to split the file at every <html> line:

$ csplit mail.txt '/^<html>$/' '{*}'

 - mail.txt    => input file
 - /^<html>$/  => pattern: start a new piece at every line that is exactly `<html>`
 - {*}         => repeat the previous pattern as many times as possible

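Check the output (xx00 is empty because the file starts with a matching line; GNU csplit's -z option suppresses empty pieces):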

$ ls
mail.txt  xx00  xx01  xx02  xx03
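
If the default xxNN names are not what you want, here is a minimal sketch of the same split with named pieces. It assumes GNU csplit (-z and -b are GNU extensions), and the scratch directory and the mail_ prefix are made up for illustration:

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' '<html>' '    mail A' '</html>' '' \
              '<html>' '    mail B' '</html>' '' \
              '<html>' '    mail C' '</html>' > mail.txt

# -s  do not print the size of each piece
# -z  do not keep empty pieces (GNU extension)
# -f  prefix for the output files (hypothetical name: mail_)
# -b  printf-style suffix for the output files (GNU extension)
csplit -s -z -f mail_ -b '%02d.html' mail.txt '/^<html>$/' '{*}'

ls mail_*.html
```

With -z the empty first piece is dropped, so each remaining file should hold exactly one message.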

If you want to do it in awk:

$ awk '/<html>/{filename=NR".txt"}; {print >filename}' mail.txt
$ ls
1.txt  5.txt  9.txt  mail.txt
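
A variation on the awk one-liner, as a sketch only: it numbers the pieces sequentially instead of by line number, and closes each finished file so that a long input does not run out of open file descriptors. The mail_NN.html names and the scratch directory are assumptions for illustration:

```shell
# Work in a scratch directory and recreate the sample input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '<html>' '    mail A' '</html>' '' \
              '<html>' '    mail B' '</html>' '' \
              '<html>' '    mail C' '</html>' > mail.txt

awk '
    # A line that is exactly <html> starts a new piece:
    # close the previous file, then pick the next sequential name.
    /^<html>$/ { if (out) close(out); out = sprintf("mail_%02d.html", ++n) }
    # Write every line once a piece has been started; anything
    # before the first <html> line is skipped.
    out        { print > out }
' mail.txt

ls mail_*.html
```

Unlike the one-liner above, this version does not fail when the input has lines before the first <html>; it simply ignores them.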

The csplit program solves your problem elegantly:

csplit "$FILE" '/<!DOCTYPE.*/' '{*}'
