Unix grep regex containing 'x' but not containing 'y'
I need a single-pass regex for Unix grep that matches lines containing, say, alpha but not containing beta.
grep 'alpha' <> | grep -v 'beta'
Solution 1:
The other answers here show some ways you can contort different varieties of regex to do this, although I think it does turn out that the answer is, in general, “don’t do that”. Such regular expressions are much harder to read and probably slower to execute than just combining two regular expressions using the boolean logic of whatever language you are using. If you’re using the grep command at a Unix shell prompt, just pipe the results of one to the other:
grep "alpha" | grep -v "beta"
I use this kind of construct all the time to winnow down excessive results from grep. If you have an idea of which result set will be smaller, put that one first in the pipeline to get the best performance, as the second command only has to process the output from the first, and not the entire input.
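As a quick illustration of the pipeline, here is a minimal sketch run against made-up input. The sample lines and the variable names `sample` and `result` are assumptions for the demo, not part of the question.

```shell
# Hypothetical sample input, one record per line.
sample='alpha only
alpha and beta
beta only
neither'

# Keep lines containing "alpha", then drop those that also contain "beta".
result=$(printf '%s\n' "$sample" | grep 'alpha' | grep -v 'beta')

printf '%s\n' "$result"   # prints: alpha only
```

Swapping the order (`grep -v 'beta' | grep 'alpha'`) should give the same lines; only the amount of data the second command sees changes.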
Solution 2:
Well, as we're all posting answers, here it is in awk ;-)
awk '/x/ && !/y/' infile
I hope this helps.
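To see the awk one-liner in action without a file, here is a small sketch that feeds it made-up lines on standard input. The sample words and the variable names are assumptions for illustration only.

```shell
# Hypothetical sample input: some lines have an x, some a y, some both.
sample='fax
foxy
yarn
mix
plain'

# Print lines that match /x/ and do not match /y/, in a single pass.
awk_result=$(printf '%s\n' "$sample" | awk '/x/ && !/y/')

printf '%s\n' "$awk_result"   # prints "fax" and "mix", one per line
```

The same pattern should carry over to the words from the question, for example `awk '/alpha/ && !/beta/' infile`.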
Solution 3:
^((?!beta).)*alpha((?!beta).)*$
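would do the trick, I think, given a regex engine that supports negative lookahead (grep's basic and extended syntaxes do not; GNU grep -P does).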
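To try this pattern from the shell, here is a minimal sketch that assumes GNU grep built with PCRE support (the `-P` option); other grep implementations may not accept it. The sample lines and variable names are made up for the demo.

```shell
# Hypothetical sample input.
sample='alpha only
alpha then beta
beta then alpha
neither'

# -P selects Perl-compatible regexes, which provide the (?!...) lookahead.
# Assumption: this grep is GNU grep compiled with PCRE support.
pcre_result=$(printf '%s\n' "$sample" |
  grep -P '^((?!beta).)*alpha((?!beta).)*$')

printf '%s\n' "$pcre_result"   # prints: alpha only
```

Each `((?!beta).)*` consumes characters only at positions where "beta" does not start, so a line with "beta" anywhere before or after "alpha" cannot match end to end.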
Solution 4:
I'm pretty sure this isn't possible with true regular expressions. The [^y]*x[^y]* example would match yxy, since the * allows zero or more non-y matches.
EDIT:
Actually, this seems to work: ^[^y]*x[^y]*$. It basically means "match any line that starts with zero or more non-y characters, then has an x, then ends with zero or more non-y characters".
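The difference the anchors make can be checked with a short sketch on made-up lines (the sample data and variable names are assumptions). Note that a bracket expression such as `[^y]` excludes single characters, so this approach does not carry over directly to multi-character words like alpha and beta.

```shell
# Hypothetical sample input.
sample='x
axb
yxy
xy
abc'

# Unanchored: any line containing an x matches, including yxy and xy.
unanchored=$(printf '%s\n' "$sample" | grep -c '[^y]*x[^y]*')

# Anchored: the whole line must be free of y and contain an x.
anchored=$(printf '%s\n' "$sample" | grep '^[^y]*x[^y]*$')

printf '%s\n' "$unanchored"   # prints: 4
printf '%s\n' "$anchored"     # prints "x" and "axb", one per line
```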