sed with PCRE (like grep -P)

I am glad that grep supports Perl Compatible Regular Expressions via the -P option.

Is there a reason why sed does not have this feature?

Work-around:

You can use Perl (the "Pathologically Eclectic Rubbish Lister"):

perl -pe 's/../../g' file

or, to edit the file in place:

perl -i -pe 's/../../g' file

This works for the cases where I would use sed. If things get more complicated, I write a small Python script.

BTW, I switched to No Shell-Scripting

In the case of GNU sed, the stated reason appears to be:

I was afraid it fell into one of those 'cracks'...though from what was said at the time, some part of the work was already done and it looked like a matter of docs and packaging... (though, I admit, in Computer Sci, the last 10% of the work often takes 90% of the time...

See GNU bug report logs - #22801 status on committed change: upgrading 'sed' RE's to include perlRE syntax - or search the sed-devel Archives for "PCRE" if you want more details.

Don't forget you can use perl itself for many of the simple one-liners for which you might want to use PCRE in sed.

As my substitution needs have become more complex, perl -pe has become preferable to sed -e. In particular, perl's character classes and quantifiers are more concise than the hoops I need to jump through for sed.

journalctl -u auditd -S 'yesterday' |\
  perl -pe 's/^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) ([\w-]+) audispd/$1 generic-hostname audispd/;
      s/node=[\w-]+/node=generic-hostname/;'

journalctl -u auditd -S "yesterday" |\
  sed -e 's/^\([[:alpha:]]\{3\} [[:digit:]]\{2\} [[:digit:]]\{2\}:[[:digit:]]\{2\}:[[:digit:]]\{2\}\) \([[:alnum:]_-]\+\) audispd/\1 generic-hostname audispd/;
      s/node=[[:alnum:]_-]\+/node=generic-hostname/;'

I could use [0-9] instead of [[:digit:]] and [A-Za-z] instead of [[:alpha:]], but a) both of those are still longer than the perl equivalents and b) [A-Za-z] will not match non-ASCII characters the way the perl equivalents can.

bosses-r-dum> echo 'å' | sed -e 's/[A-Za-z]/X/'
å
bosses-r-dum> echo 'å' | perl -CS -pe 's/\w/X/'
X
bosses-r-dum>

If you have to deal with Unicode, being able to add a flag and have things "Just Work" is very handy. I tend to grow my regexps organically, so it makes sense to use the same tool for both 'simple' and 'complex' ones: a 'simple' regexp can easily turn into a 'complex' one when requirements change, and then I don't have to switch tools (rewriting every [x]\{#\} as [x]{#} and the like).
