Weird grep behavior with CJK characters? (bash)

grep fails to match certain strings with CJK characters. For example:

  1. Create a text file with the content below:
  2. Run grep:
    >> grep "ShellType.サモナ\u30FC" test.txt
    (empty output)
    >> grep "ShellType.サモナ.*\u30FC" test.txt

Is this a grep bug, or do CJK characters need special handling?
How do I properly search for CJK strings with grep, or with other reliable tools?

System: Ubuntu 20.04
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
grep (GNU grep) 3.4

It has nothing to do with CJK. You can use -o to (more or less) see what \u actually means in grep:

[tom@ideapad ~]$ cat /tmp/meh 
[tom@ideapad ~]$ grep -o '\u' /tmp/meh 
[tom@ideapad ~]$ grep -o '.\u' /tmp/meh 
[tom@ideapad ~]$ grep -o '.*\u' /tmp/meh 
[tom@ideapad ~]$ grep -o '.*.*\u' /tmp/meh 
[tom@ideapad ~]$ grep -o '==ShellType.サモナ.*\u' /tmp/meh
[tom@ideapad ~]$ grep -o '==ShellType.サモナ.\u' /tmp/meh
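
To make the point reproducible without the original file, here is a minimal sketch. It assumes the line being searched contains the literal text \u30FC (a backslash, then u30FC) and that GNU grep is in use; the variable name sample is made up for illustration. POSIX leaves a backslash before an ordinary character undefined, and newer GNU grep releases may also warn about the stray backslash on stderr.

```shell
# Hypothetical sample line: a literal backslash followed by "u30FC".
sample='==ShellType.サモナ\u30FC'

# '\u' is not a code point escape in grep; GNU grep matches a plain "u".
printf '%s\n' "$sample" | grep -o '\u'          # prints: u

# One "." consumes the backslash, then '\u' matches the "u" after it.
printf '%s\n' "$sample" | grep -o 'サモナ.\u'    # prints: サモナ\u

# Without the ".", nothing matches: "サモナ" is followed by "\", not "u".
# grep exits with status 1 on no match, hence the "|| true".
printf '%s\n' "$sample" | grep -c 'サモナ\u' || true   # prints: 0
```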

Note that I've been using single quotes because, with \ involved, double quotes would make things even more complicated: the shell processes backslashes inside double quotes before grep ever sees them. The proper ways to do the grep you (seem to) want are:

[tom@ideapad ~]$ grep -o '==ShellType\.サモナ\\u' /tmp/meh 
[tom@ideapad ~]$ grep -o "==ShellType\\.サモナ\\\\u" /tmp/meh 
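
If the quoting gets confusing, one way to check what grep actually receives is to print the argument with printf '%s\n' first. The sketch below assumes the same kind of line as above (literal \u30FC text; the variable name sample is hypothetical). It also shows grep -F, which treats the pattern as a fixed string and so sidesteps regex escaping when no regex features are needed.

```shell
# Hypothetical sample line: a literal backslash followed by "u30FC".
sample='==ShellType.サモナ\u30FC'

# Both quoting styles hand grep the same pattern: ==ShellType\.サモナ\\u
printf '%s\n' '==ShellType\.サモナ\\u'
printf '%s\n' "==ShellType\\.サモナ\\\\u"

# As a regex, that pattern matches a literal "." and a literal "\u".
printf '%s\n' "$sample" | grep -o '==ShellType\.サモナ\\u'

# -F takes the pattern as a fixed string, so "." and "\" need no escaping.
printf '%s\n' "$sample" | grep -F '==ShellType.サモナ\u30FC'
```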

As far as I know, grep does not treat \u30FC (however it is escaped) as a Unicode character the way printf in a shell does. To actually grep for a character by its code point, you can have the shell expand it first with ANSI-C quoting (this might not work in every POSIX shell, though):

[tom@ideapad ~]$ printf '\u30FC' > /tmp/heh
[tom@ideapad ~]$ grep $'\u30FC' /tmp/heh 
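
For shells where $'\u30FC' is not available, a possible fallback is to let printf emit the UTF-8 bytes of U+30FC (E3 83 BC) through octal escapes, which POSIX printf supports. This is only a sketch: it assumes the data is UTF-8 encoded, and the variable name choon and the sample line are made up for illustration.

```shell
# U+30FC (ー) is E3 83 BC in UTF-8, i.e. octal 343 203 274.
# Octal escapes in the printf format string avoid $'...' entirely.
choon=$(printf '\343\203\274')

# Hypothetical sample line holding the real character, not the text "\u30FC".
# -F is enough here because the pattern is a single literal character.
printf '%s\n' 'ShellType.サモナー' | grep -F "$choon"
```

A line that only contains the six-character text \u30FC would not match this pattern, since the bytes differ.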

P.S. It might be worth mentioning that, while ANSI-C quoting uses single quotes in its syntax, it does NOT behave like single quotes for anything other than the code point expansion; backslash escapes are still processed inside $'...':

[tom@ideapad ~]$ grep -o $'==ShellType\.サモナ\\u' /tmp/meh 
[tom@ideapad ~]$ grep -o $'==ShellType\\.サモナ\\\\u' /tmp/meh 