After setting EnableHexNumpad=1, hex Unicode Alt codes work well in Notepad++ and Excel but behave strangely in Word.
Below is a summary of the behavior I see, bearing in mind that Windows can't handle codes above 0xFFFF:
1. Word works flawlessly with decimal code points entered through Alt+decimal. For example, there is no problem inserting even 😀 as Alt+128512. (As a side note, Notepad++ doesn't handle this shortcut and puts \0 in the file. Excel ignores the input.)
2. Word accepts hex codes for low code point blocks. However, for some (many?) higher Unicode blocks, if the code contains no hex A-F digits, it is unavoidably interpreted as decimal. As an example, take the "Ethiopic" block, which begins at 0x1200, and enter in Word the sequence Alt++1200, Alt++1201, ..., Alt++120F, i.e. the first row of the block. We would expect to insert the characters ሀሁሂሃሄህሆሇለሉሊላሌልሎሏ. Instead I see ҰұҲҳҴҵҶҷҸҹሊላሌልሎሏ, so the last six characters are correct while the first ten are not: they come from code points 0x4B0-0x4B9, or in decimal 1200-1209. The error is apparent: when the code contains no A-F digits, it is interpreted as decimal even if the + is prepended. Notepad++ and Excel work as expected in these cases. This seems to be linked to some internal association with the available font glyphs, but I didn't reach any definitive conclusion.
3. For completely unsupported code blocks, the A-F digits are ignored and only the decimal digits count, interpreted as decimal. As an example, let's enter Alt++30C4. Instead of the ツ katakana I obtain İ, which is code point 0x130 or decimal 304 (the original 30C4 string without the C). The same happens with hex codes of more than 4 digits: attempting to insert Alt++1f600, the emoji from point 1, inputs 0x640 or decimal 1600. In Excel this latter Alt code inserts 0xF600 (verified with the UNICODE() function), which is invalid and shown as  (U+F600) but, keeping in mind the 4-digit limitation, this seems reasonable.
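The arithmetic behind these observations can be checked with a short Python sketch. It does not drive Word or Excel; it only reproduces the conversions described above. The helper names `strip_hex_letters` and `low_16_bits` are made up for illustration, and the letter-dropping and 16-bit truncation are my reading of the observed results, not documented behavior.

```python
def strip_hex_letters(code: str) -> int:
    """Hypothetical model: drop the A-F letters, read what is left as decimal."""
    return int("".join(ch for ch in code if ch in "0123456789"))


def low_16_bits(code: str) -> int:
    """Hypothetical model: parse as hex, keep only the low 16 bits."""
    return int(code, 16) & 0xFFFF


# First item: Alt+128512 is the decimal form of U+1F600.
print(hex(128512))

# Second item: decimal 1200..1209 are the code points U+04B0..U+04B9.
print([f"U+{n:04X}" for n in range(1200, 1210)])

# Third item: 30C4 without the C is decimal 304; 1f600 without the f is 1600.
for typed in ("30C4", "1f600"):
    n = strip_hex_letters(typed)
    print(f"{typed} -> decimal {n} -> U+{n:04X}")

# Excel result: U+1F600 cut down to four hex digits.
print(f"U+{low_16_bits('1f600'):04X}")
```

The numbers match what I see in Word and Excel, which is why the results look like a parsing problem and not random garbage.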
So, is this simply a misconfigured system, or is there some option I can explore to restore the expected behavior (Word 365 MSO (16.0.14131.20278) 32-bit)?
Solution 1:
I would chalk this up to bugs in Word, which is well-known for carrying ancient methods that may work differently than newly-programmed ones.
For example, typing in Word "1200 alt x" gives ሀ as expected,
while typing "alt + 1200" gives Ұ.
The interesting part here is that the Unicode hex code of Ұ is 4B0.
I note that decimal 1200 converted into hex is 4B0.
I also note that "alt + 4B0" also gives Ұ.
From this I conclude that Word applies the following irrational test:
if the string entered after "alt +" contains only digits ("1200"),
it is assumed to be decimal, but if it contains one of
the letters a-f ("4B0"), it is taken as hexadecimal.
This theory of mine is borne out by your tests: when your
codes started including the letters a-f, they were interpreted
correctly as hex. As long as they contained only decimal digits,
they were wrongly interpreted as decimal.
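The rule I am guessing at can be written down as a minimal Python sketch. The function name `parse_alt_plus` is invented for this illustration, and the logic is only my theory of what Word does, not its actual code.

```python
import string


def parse_alt_plus(code: str) -> int:
    """Hypothetical model of the inferred rule; not Word's real implementation."""
    if all(ch in string.digits for ch in code):
        return int(code, 10)  # digits only: treated as decimal
    return int(code, 16)      # contains a-f: treated as hexadecimal


# The first row of the Ethiopic block, typed as 1200, 1201, ..., 120F.
row = [f"{n:X}" for n in range(0x1200, 0x1210)]
for code in row:
    cp = parse_alt_plus(code)
    print(f"alt + {code} -> U+{cp:04X} {chr(cp)}")
```

Under this model the ten all-digit codes land on U+04B0 to U+04B9 and the six codes containing a letter land on the intended Ethiopic characters, which matches the Ethiopic test. It does not account for the 30C4 case, where the letter appears to be dropped altogether.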
The implementation by Microsoft of the EnableHexNumpad option
seems to be very flawed.
Word cannot be fixed by you or me. The most you can do is report the problem to Microsoft via the Feedback Hub (which wouldn't help much).
If you need a third-party utility that doesn't have such gotchas, you may for example use the ancient UnicodeInput, which still works in Windows 10 for entering Unicode. It intercepts Alt+ and puts up a dialog box where the Unicode code can be entered.