Can GNU sed (for Windows) handle Unicode? If so, is it a code-page/locale issue, or a switch?

Solution 1:

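I don't know a great deal about sed, but from what I could find, it supports different character encodings through the locale, which is normally set with the LANG environment variable (LC_ALL and LC_CTYPE, if set, take precedence over it). I had assumed UTF-8 was the default in the absence of LANG, but POSIX programs normally fall back to the byte-oriented "C" locale when no locale variable is set, so it is safer to set one explicitly. I don't know how the Windows port is set up, though. I do strongly suspect that sed performs no encoding detection at all on the input stream.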
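To see how much the locale matters, here is a minimal sketch for a POSIX-style shell (for example the ones shipped with Cygwin or MSYS2; I have not checked how a native Windows build behaves). It feeds sed the two UTF-8 bytes of "é" and replaces every character with X. The locale name en_US.UTF-8 is an assumption and may not exist on your system.

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
# In the byte-oriented C locale, "." matches each byte separately.
bytes_seen=$(printf '\303\251\n' | LC_ALL=C sed 's/./X/g')
echo "C locale:     $bytes_seen"

# Assumption: a UTF-8 locale called en_US.UTF-8 is installed.
# If it is, "." should match the whole two-byte character once;
# if it is not, sed typically behaves as in the C locale.
chars_seen=$(printf '\303\251\n' | LC_ALL=en_US.UTF-8 sed 's/./X/g')
echo "UTF-8 locale: $chars_seen"
```

If the two lines print different numbers of X characters, sed is honoring the locale, and the problem is a locale setting rather than a command-line switch.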


You could also try escape characters instead, although that seems very cumbersome.
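For what it's worth, here is a rough sketch of what the escape approach might look like, assuming the escapes meant are GNU sed's `\xHH` byte escapes (a GNU extension, so other sed implementations may not accept them). It matches the UTF-8 bytes of "é" explicitly in the C locale, which avoids depending on any particular locale being installed.

```shell
# Replace the UTF-8 byte sequence for "é" (0xC3 0xA9) with a plain "e".
# LC_ALL=C makes sed treat the input as raw bytes, so each \xHH escape
# matches exactly one byte. Requires GNU sed.
ascii_only=$(printf 'caf\303\251\n' | LC_ALL=C sed 's/\xc3\xa9/e/g')
echo "$ascii_only"
```

This works, but you have to look up and spell out the byte sequence of every non-ASCII character yourself, which is why it gets cumbersome quickly.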