sed with PCRE (like grep -P)
I am happy that grep does support Perl Compatible Regular Expressions with the -P option.
Is there a reason why the tool sed does not have this feature?
Work-around:
You can use the Pathological Eclectic Rubbish Lister:
perl -pe 's/../../g' file
or inline replace:
perl -i -pe 's/../../g' file
This works for the cases where I use sed. If things get more complicated I write a small Python script.
BTW, I switched to No Shell-Scripting
In the case of GNU Sed, the stated reason appears to be
I was afraid it fell into one of those 'cracks'...though from what was said at the time, some part of the work was already done and it looked like a matter of docs and packaging... (though, I admit, in Computer Sci, the last 10% of the work often takes 90% of the time...
See GNU bug report logs - #22801 status on committed change: upgrading 'sed' RE's to include perlRE syntax - or search the sed-devel Archives for "PCRE" if you want more details.
Don't forget you can use perl itself for many of the simple one-liners for which you might want to use PCRE in sed.
As my substitution needs have become more complex, using perl -pe becomes preferable to sed -e. In particular, being able to use perl character classes and quantifiers is more concise than the hoops I need to jump through for sed.
journalctl -u auditd -S 'yesterday' |\
perl -pe 's/^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) ([\w-]+) audispd/$1 generic-hostname audispd/;
s/node=[\w-]+/node=generic-hostname/;'
vs
journalctl -u auditd -S "yesterday" |\
sed -e 's/^\([[:alpha:]]\{3\} [[:digit:]]\{2\} [[:digit:]]\{2\}:[[:digit:]]\{2\}:[[:digit:]]\{2\}\) \([[:alpha:]-]\+\) audispd/\1 generic-hostname audispd/;
s/node=\([[:alpha:]-]\+\) /node=generic-hostname /;'
I could use [0-9] instead of [[:digit:]] and [A-Za-z] instead of [[:alpha:]], but a) both of those are longer than the perl equivalents and b) [A-Za-z] will not match non-ASCII characters the way the perl equivalents can.
bosses-r-dum> echo 'å' | sed -e 's/[A-Za-z]/X/'
å
bosses-r-dum> echo 'å' | perl -CS -pe 's/\w/X/'
X
bosses-r-dum>
If you have to deal with Unicode, being able to add a flag and have things "Just Work" is very handy. I tend to grow my regexps organically, so using the same tool for 'simple' and 'complex' regexps makes sense: my 'simple' regexp can easily turn into a 'complex' one if/when requirements change, and I don't need to do any tooling changes (change all [x]\{#\} instances into [x]{#} and the like).