Unicode characters in MATLAB source files
I'd like to use Unicode characters in comments in a MATLAB source file. This seems to work when I write the text; however, if I close the file and reload it, "unusual" characters have been turned into question marks. I guess MATLAB is saving the file as ASCII.
Is there any way to tell MATLAB to use UTF-8 instead?
Solution 1:
According to http://www.mathworks.de/matlabcentral/newsreader/view_thread/238995
feature('DefaultCharacterSet', 'UTF8')
will change the encoding to UTF-8. You can put the line above in your startup.m file.
Solution 2:
How the MATLAB Process Uses Locale Settings shows how to set the encoding for different platforms. Use
feature('DefaultCharacterSet')
You can read more about this undocumented function here. See also this Matlab Central thread for other options.
Solution 3:
Mac OSX only!
As I found solution which worked in my case I want to share it.
Mathworks advises here to use slCharacterEncoding(encoding)
in order to change the encoding as desired, but for the OSX this does not solve the issue exactly as the feature('DefaultCharacterSet')
in accepted answer does not do it. What helped me to get the UTF-8 encoding set for opening and saving .m files was the following link on MATLAB answers:
https://www.mathworks.com/matlabcentral/answers/12422-macosx-encoding-problem
Matlab seems to ignore any value set in slCharacterEncoding(encoding)
or feature('DefaultCharacterSet')
but uses region set in System Preferences -> Language & Region. After checking which region is selected in our case then it is possible to define the actual encoding in the hidden configuration file in
$matlabroot/bin/lcdata.xml
This directory can be opened by getting to the Applications and after right click on Matlab by selecting Show Package Contents as on screenshot (here in German)
For example for German default ISO-8859-1 it is possible to adjust it by changing the respective line in the file lcdata.xml:
<locale name="de_DE" encoding="ISO-8859-1" xpg_name="de_DE.ISO8859-1">
to:
<locale name="de_DE" encoding="UTF-8" xpg_name="de_DE.UTF-8">
If the region which is selected is not present in the lcdata.xml file this will not work.
Hope this helps!
Solution 4:
The solution provided here worked for me on Windows with R2018a.
In case link doesn't work: the idea is to use file matlabroot/bin/lcdata.xml
to configure an alias for encoding name (some explanation can be found in this very file in the comments):
<codeset>
<encoding name="UTF-8">
<encoding_alias name="windows-1252" />
</encoding>
</codeset>
You would use your own value instead of windows-1252
, currently used encoding can be obtained by running feature('locale')
.
Although, if you use Unicode characters in help comments, the help browser does not recognize them, as well as console window output.