Perl for matching with regular expressions in Terminal?
I'm trying to familiarize myself a little with Perl to use for regular expression searches in Terminal (Mac). Now, I'm not really looking to learn Perl rigorously, just trying to find out how to do some simple regular expressions.
But I can't figure out how to do this in Terminal:
I'd like to be able to match expressions over several lines, and I'll take HTML tags as an example. PLEASE NOTE that the HTML tag is just an example of something to match, and specifically something that goes over multiple lines. Whether matching HTML with regular expressions is a good idea or not is not the issue. I just want to understand the syntax of matching with Perl on the command line!
Say I want to match the entire ul tag here:
<ul>
<li>item 1</li>
<li>item 2</li>
</ul>
I would like to:
- Be able to match this in a file and output the match to stdout (don't ask why, I just want to understand how it works :-))
- Be able to replace it with something else.
For matching, I found something like this (using 'start' and 'end' as an example here, from a simple text file I was testing with, but please give the example for the ul tag instead):
perl -wnE 'say $1 if /(start(.*?)end)/' test.txt
This matches a part, but only on one line. Surprisingly, adding the s at the end didn't work to make it "dotall" or "single-line mode", it still just matched one line...
For replacing, I tried something like this:
perl -pe 's/start(.*?)end/replacement text/'s test.txt
This didn't work either...
Well, here's a wikipedia page for matching or replacing with Perl one liners. I did this in Cygwin:
Perl can behave like grep or like sed.
The /s modifier makes the dot match a newline.
The -0777 switch makes Perl apply the regular expression to the whole input at once instead of line by line.
\n can match a newline as well.
$ echo -e 'a\nb\nc\nd' | perl -0777 -pe 's/.*c//s'
d
user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -pe 's/.*c//s'
a
b
d
Here is the other form, -ne with print $1:
user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -ne 'print $1 if /(.*c)/s'
c
user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -0777 -ne 'print $1 if /(.*c)/s'
a
b
c
user@comp ~
$
Also
$ echo xxx|perl -lne 'print ""'
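Perl's equivalent of \0 or &, i.e. the whole match, is $&. Note that $_ is something different: it holds the current input line (or the whole input under -0777), whether or not anything matched. To put text directly before and after it without a space, write ${_}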
$ echo xxx|perl -lne 'print "a${_}${_}a"'
axxxxxxa
and
$ echo xxx|perl -lpe 's/.*/a${_}${_}a"/'
axxxxxxa"
### Some further examples
$ cat t.t
<ul>
<li>item 1</li>
<li>item 2</li>
</ul>
$ perl -0777 -ne 'print $1 if /\<ul\>(.*?)\<\/ul>/s' t.t
<li>item 1</li>
<li>item 2</li>
user@comp ~
$ perl -0777 -ne 'print $1 if /(.*)/s' t.t
<ul>
<li>item 1</li>
<li>item 2</li>
</ul>
user@comp ~
$
An example of global matching for the -ne form (change "if" to "while"):
$ echo -e 'bbb' | perl -0777 -ne 'print $1 while /(b)/sg'
bbb
For the -pe form, just add the g at the end (/sg or /gs, same thing):
$ echo -e 'aaa' | perl -0777 -pe 's/a/z/s'
zaa
user@comp ~
$ echo -e 'aaa' | perl -0777 -pe 's/a/z/sg'
zzz
Note: this question contrasts /s and -0777.
Those print $1 examples don't show the whole line. This link, https://dzone.com/articles/perl-as-a-better-grep, has an example that does: perl -wln -e "/RE/ and print;" foo.txt