How can I search for a multiline pattern in a file?
Solution 1:
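You could use an awk range pattern, which prints every line from a match of the first regex through the next match of the second: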
awk '/Start pattern/,/End pattern/' filename
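To make the range behaviour concrete, here is a minimal sketch run against a throwaway file. The file contents and the `sample` and `block` variable names are made up for illustration.

```shell
# Build a small sample file (hypothetical contents).
sample=$(mktemp)
printf '%s\n' 'header' 'Start pattern' 'body line' 'End pattern' 'footer' > "$sample"

# Print every line from the first line matching the start regex
# through the next line matching the end regex, inclusive.
block=$(awk '/Start pattern/,/End pattern/' "$sample")
printf '%s\n' "$block"

rm -f "$sample"
```

Note that awk ranges are line-oriented: whole lines are printed, and if the end pattern never appears, everything from the start line to the end of the file is printed.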
Solution 2:
I discovered pcregrep, which stands for Perl Compatible Regular Expressions GREP. Its -M option lets a pattern match across line breaks.
For example, you need to find files where the '_name' variable is immediately followed by the '_description' variable:
find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'
Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', ...
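One way to sidestep the platform question is to write the line break as `\r?\n`, which accepts both LF and CRLF. The sketch below checks this against a throwaway CRLF file. It uses GNU grep with -P and -z rather than pcregrep, purely as an assumption about what is installed. The file contents and variable names are hypothetical.

```shell
# Sample file with Windows-style CRLF line endings (hypothetical contents).
sample=$(mktemp)
printf '_name = "x"\r\n_description = "y"\r\n' > "$sample"

# \r?\n accepts either LF or CRLF as the line break.
# -z makes grep treat the whole file as one record, so \n can match;
# -c counts matching records (0 or 1 here, since the file has no NUL bytes).
crlf_hits=$(grep -Pzc '_name.*\r?\n.*_description' "$sample")
echo "$crlf_hits"

rm -f "$sample"
```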
Solution 3:
Here is the same example using GNU grep:
grep -Pzo '_name.*\n.*_description'
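-z/--null-data: Treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.
Because a typical text file contains no NUL bytes, grep then sees the whole file as a single "line", which is what allows '\n' in the pattern to match.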
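With -z, each match printed by -o is also terminated by a NUL byte rather than a newline, so the raw output can look run together. A minimal sketch, assuming GNU grep built with -P support, that converts the NULs back to newlines for display (the sample file and the `match` variable are made up for illustration):

```shell
# Sample Python-like file (hypothetical contents).
sample=$(mktemp)
printf '%s\n' 'class A:' '    _name = "a"' '    _description = "first"' '    _other = 1' > "$sample"

# -P: Perl-compatible regex, -z: NUL-separated records (the whole file here),
# -o: print only the matched part. Matches are NUL-terminated, so turn the
# NULs into newlines before displaying them.
match=$(grep -Pzo '_name.*\n.*_description' "$sample" | tr '\0' '\n')
printf '%s\n' "$match"

rm -f "$sample"
```

Use -l instead of -o if you only want the names of the files that contain a match.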
Solution 4:
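grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an HTML document, even if it spans multiple lines, you can use this (the -z option is needed so that grep reads the whole file as one record instead of matching line by line):
grep -Pzo '(?s)<title>.*</title>' example.html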
Since PCRE follows Perl's regular expression syntax, use the Perl documentation for reference:
- http://perldoc.perl.org/perlre.html#Modifiers
- http://perldoc.perl.org/perlre.html#Extended-Patterns
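The (?s) modifier makes '.' match newlines as well, which also means a greedy '.*' can run much further than intended. The sketch below compares greedy and lazy quantifiers. It assumes GNU grep with -P support and uses -z so the file is read as a single record. The sample file and the `greedy` and `lazy` variable names are hypothetical.

```shell
# Sample HTML-like file with two title elements (hypothetical contents).
sample=$(mktemp)
printf '%s\n' '<title>First' 'page</title>' '<p>text</p>' '<title>Second</title>' > "$sample"

# Greedy: .* runs to the LAST </title> in the record, giving one long match.
greedy=$(grep -Pzo '(?s)<title>.*</title>' "$sample" | tr '\0' '\n')

# Lazy: .*? stops at the first </title> after each <title>, giving two matches.
lazy=$(grep -Pzo '(?s)<title>.*?</title>' "$sample" | tr '\0' '\n')

printf 'greedy:\n%s\n\nlazy:\n%s\n' "$greedy" "$lazy"

rm -f "$sample"
```

For a document with a single title element the two forms behave the same. The lazy form is the safer default when a tag can occur more than once.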
Solution 5:
Here is a more useful example:
pcregrep -Mi "<title>(.*\n){0,5}.*</title>" afile.html
It finds the title tag in an HTML file even if its content contains up to 5 line breaks. The trailing '.*' covers any text on the same line before the closing tag.
Here is an example with no limit on the number of lines:
pcregrep -Mi "(?s)<title>.*</title>" example.html