Using ' (' (space followed by parenthesis) as field separator in awk

Solution 1:

To use ␣( (space + parenthesis) as the field separator in awk, pass "␣\\\(" (inside double quotes) to -F:

$ echo "a (b (c" | awk -F " \\\(" '{ print $1; print $2; print $3 }'
a
b
c

Alternatively, use single quotes and two backslashes:

$ echo "a (b (c" | awk -F ' \\(' '{ print $1; print $2; print $3 }'
a
b
c

The escaping is needed because an FS longer than one character is treated as a regular expression, and ␣( (a single parenthesis with a leading space) is a malformed one: the left parenthesis opens a group that is never closed. Escaping the parenthesis makes awk match it as a literal character.
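
The two spellings above differ only in how the shell treats the backslashes before awk sees them. A minimal sketch to check this (the variable names dq, sq and nf are just for illustration):

```shell
# What the shell hands to awk for each quoting style.
dq=" \\\("   # double quotes: the shell turns \\ into \ and leaves \( alone
sq=' \\('    # single quotes: passed through untouched
printf '%s\n' "$dq" "$sq"   # both lines show: space, backslash, backslash, (

# awk then processes escape sequences in the -F string itself,
# so the regex it finally uses is: space, backslash, (
nf=$(echo "a (b (c" | awk -F "$dq" '{ print NF }')
echo "$nf"   # prints 3
```

So there are two layers of unescaping (shell, then awk), which is why a single backslash is not enough.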

A lone ( (a single parenthesis without a leading space) works unescaped because a single-character FS (other than a space) is taken literally instead of being treated as a regular expression.
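
A short sketch of the single-character case, plus one way to sidestep backslash counting entirely: a bracket expression, which is not part of the answer above but should behave the same, since ( is literal inside [...]. The variable names are just for illustration.

```shell
# Single-character FS: "(" is taken literally, so no escaping is needed.
# Note that the space before each "(" stays in the field.
first=$(echo "a (b (c" | awk -F '(' '{ print "[" $1 "]" }')
echo "$first"    # prints [a ]

# Bracket expression: " [(]" is a regex matching a space followed by "(".
second=$(echo "a (b (c" | awk -F ' [(]' '{ print $1 "," $2 "," $3 }')
echo "$second"   # prints a,b,c
```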

Solution 2:

I found this thread when searching for a solution to a similar problem: using either ␣ OR ( as a field separator for awk. This didn't quite answer it, but led me to my solution:

If you want the combination ␣( as a single unit to separate fields in awk, use awk -F '( \\()' ...:

$ echo "This (maybe) is a test()" | awk -F '( \\()' '{print $1 "\n" $2 "\n" $3 "\n" $4 "\n" $5 "\n" $6 "\n" $7; print "Number of Fields: " NF}'
This
maybe) is a test()





Number of Fields: 2
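
As far as I can tell, the outer parentheses in '( \\()' only group the expression and do not change what it matches, so it should split exactly like ' \\(' from Solution 1. A quick check (variable names are just for illustration):

```shell
line="This (maybe) is a test()"

# With the grouping parentheses around the whole separator.
with_group=$(echo "$line" | awk -F '( \\()' '{ print NF ":" $2 }')
# Without them: same regex, minus the group.
without_group=$(echo "$line" | awk -F ' \\(' '{ print NF ":" $2 }')

echo "$with_group"
echo "$without_group"   # both print 2:maybe) is a test()
```

The grouping only becomes necessary once you add alternatives with |.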

If you're looking for a solution to my similar problem (either ␣ OR ( as the separator), use awk -F '( |\\()' ...:

$ echo "This (maybe) is a test()" | awk -F '( |\\()' '{print $1 "\n" $2 "\n" $3 "\n" $4 "\n" $5 "\n" $6 "\n" $7; print "Number of Fields: " NF}'
This

maybe)
is
a
test
)
Number of Fields: 7
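
The empty second field above comes from the space and the ( sitting next to each other: each one counts as its own separator. A sketch of two variations, using bracket expressions instead of the alternation (my own substitution, not something the thread proposed; variable names are just for illustration):

```shell
line="This (maybe) is a test()"

# "[ (]" matches one space or one "(", the same set as "( |\\()".
n_single=$(echo "$line" | awk -F '[ (]' '{ print NF }')

# "[ (]+" treats a run of separators as one, so " (" no longer
# produces an empty field in between.
n_run=$(echo "$line" | awk -F '[ (]+' '{ print NF }')

echo "$n_single $n_run"   # prints 7 6
```

Which one you want depends on whether empty fields between adjacent separators are meaningful in your data.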