Does the bash shell interpret quotation marks inconsistently?

Here is my code. I'm on a mid-2014 MacBook Pro, updated to the latest, and I use Bash 5.1.8.

apples-MacBook-Pro:Documents apple$ egrep s* states.txt
apples-MacBook-Pro:Documents apple$ egrep "s*" states.txt
Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Mississippi
Missouri
Montana
Nebraska
Nevada
New Hampshire
New Jersey
New Mexico
New York
North Carolina
North Dakota
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Vermont
Virginia
Washington
West Virginia
Wisconsin
Wyoming
apples-MacBook-Pro:Documents apple$ egrep s{2} states.txt
Massachusetts
Mississippi
Missouri
Tennessee
apples-MacBook-Pro:Documents apple$ egrep "s{2}" states.txt
Massachusetts
Mississippi
Missouri
Tennessee

As you can see, if I don't quote s*, egrep doesn't interpret the metacharacter * (I assume it treats the * as a literal?), but if I quote s*, it is treated as a metacharacter, as expected. This, however, is not the case with the usage of {}: regardless of whether I quote the regex or not, it is interpreted by the shell as a metacharacter.

Why the discrepancy?


Solution 1:

As you can see, if I don't quote s*, egrep doesn't interpret the metacharacter * (I assume it treats the * as a literal?), but if I quote s*, it is treated as a metacharacter, as expected.

If you don't quote the s*, egrep does not even receive the metacharacter *, because bash expands the wildcard before it runs the command – this is the usual way file wildcards work in Unix shells.

As you have a file matching this wildcard (the very same states.txt), the actual command that will be run by bash is egrep states.txt states.txt. (Of course, if there are more files beginning with 's', they will be included as additional arguments.)

Only if the wildcard matches no files is it passed to the program unaltered, at least with bash's default settings (e.g. xnughxkrtb* will probably remain as-is). You can find out the actual command by prepending echo: since wildcard expansion is done by the shell, it happens for echo exactly as it does for egrep.
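
The echo trick can be tried safely in a scratch directory. This is only a sketch: the temporary directory and the three-line states.txt are stand-ins for the asker's real file, and it assumes bash's default globbing options (nullglob and failglob off).

```shell
# Work in an empty scratch directory so the glob results are predictable.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' Alabama Mississippi Texas > states.txt   # stand-in data

# Unquoted: the shell replaces s* with the matching file name.
unquoted=$(echo egrep s* states.txt)
echo "$unquoted"

# Quoted: the pattern reaches the command untouched.
quoted=$(echo egrep "s*" states.txt)
echo "$quoted"

# No matching file: the pattern is passed through unchanged.
nomatch=$(echo xnughxkrtb*)
echo "$nomatch"
```

The first echo shows the file name where the pattern used to be, which is why the unquoted egrep printed nothing: it searched states.txt for the text "states.txt".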

On the other hand, if you do quote the s*, it is received by egrep exactly as written. (Though in egrep it is not a wildcard but a regexp, one that matches literally everything, as any string has "zero or more" 's' characters, which is why it outputs all states.)

The shell also recognizes \ to suppress a special character, so egrep s\* states.txt would also have worked. (If you want egrep itself to receive the backslash literally, you may need to double it.)
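
As a rough illustration of the quoting forms, the sketch below checks that double quotes, single quotes, and a backslash all deliver the same two characters to the command, and then compares the quoted and unquoted patterns. It uses grep -E (the same regex flavor as egrep) and a made-up three-line states.txt in a temporary directory; neither comes from the question.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' Alabama Mississippi Texas > states.txt   # stand-in data

# All three forms hand the two characters s* to the command.
dq=$(printf '%s' "s*")
sq=$(printf '%s' 's*')
bs=$(printf '%s' s\*)
echo "$dq $sq $bs"

# As a regex, s* matches every line, so -c counts all three.
all=$(grep -E -c 's*' states.txt)

# Unquoted, the pattern becomes the file name states.txt, which matches
# no line; grep exits non-zero in that case, hence the "|| true".
none=$(grep -E -c s* states.txt || true)
echo "$all $none"
```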

This, however, is not the case with the usage of {}: regardless of whether I quote the regex or not, it is interpreted by the shell as a metacharacter.

This is almost the same situation, with {} being another type of a shell expansion. However, in your case it happens to still work unquoted because this type of shell expansion does not trigger unless it has at least two comma-separated items (or a range). That is, {a,b} or {a..z} are expanded by bash but {a} alone isn't.

For example, if you had tried s{2,5} without quoting, this would have been expanded by the shell, and the resulting command would've been egrep s2 s5 states.txt. (Again, you can detect this by using echo first.)
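
The brace cases can be compared side by side with echo. In this sketch each command is run through bash -c, because brace expansion is a bash feature rather than part of POSIX sh; the variable names are only for illustration.

```shell
# A single item in braces is left alone, so egrep would see s{2}.
single=$(bash -c 'echo egrep s{2} states.txt')

# Two comma-separated items are expanded into two separate words.
pair=$(bash -c 'echo egrep s{2,5} states.txt')

# A range is expanded as well.
range=$(bash -c 'echo s{1..3}')

# Quoting suppresses brace expansion, so the regex survives intact.
quoted=$(bash -c 'echo egrep "s{2,5}" states.txt')

printf '%s\n' "$single" "$pair" "$range" "$quoted"
```

So quoting the regex is the safe habit in both cases: it protects * from filename expansion and {m,n} from brace expansion.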