Why are hexadecimal numbers prefixed with 0x?
Why are hexadecimal numbers prefixed as 0x
?
I understand the usage of the prefix but I don't understand the significance of why 0x
was chosen.
Short story: The 0
tells the parser it's dealing with a constant (and not an identifier/reserved word). Something is still needed to specify the number base: the x
is an arbitrary choice.
Long story: In the 60's, the prevalent programming number systems were decimal and octal — mainframes had 12, 24 or 36 bits per byte, which is nicely divisible by 3 = log2(8).
The BCPL language used the syntax 8 1234
for octal numbers. When Ken Thompson created B from BCPL, he used the 0
prefix instead. This is great because
- an integer constant now always consists of a single token,
- the parser can still tell right away it's got a constant,
- the parser can immediately tell the base (
0
is the same in both bases), - it's mathematically sane (
00005 == 05
), and - no precious special characters are needed (as in
#123
).
When C was created from B, the need for hexadecimal numbers arose (the PDP-11 had 16-bit words) and all of the points above were still valid. Since octals were still needed for other machines, 0x
was arbitrarily chosen (00
was probably ruled out as awkward).
C# is a descendant of C, so it inherits the syntax.
Note: I don't know the correct answer, but the below is just my personal speculation!
As has been mentioned a 0 before a number means it's octal:
04524 // octal, leading 0
Imagine needing to come up with a system to denote hexadecimal numbers, and note we're working in a C style environment. How about ending with h like assembly? Unfortunately you can't - it would allow you to make tokens which are valid identifiers (eg. you could name a variable the same thing) which would make for some nasty ambiguities.
8000h // hex
FF00h // oops - valid identifier! Hex or a variable or type named FF00h?
You can't lead with a character for the same reason:
xFF00 // also valid identifier
Using a hash was probably thrown out because it conflicts with the preprocessor:
#define ...
#FF00 // invalid preprocessor token?
In the end, for whatever reason, they decided to put an x after a leading 0 to denote hexadecimal. It is unambiguous since it still starts with a number character so can't be a valid identifier, and is probably based off the octal convention of a leading 0.
0xFF00 // definitely not an identifier!
It's a prefix to indicate the number is in hexadecimal rather than in some other base. The C programming language uses it to tell compiler.
Example:
0x6400
translates to 6*16^3 + 4*16^2 + 0*16^1 +0*16^0 = 25600.
When compiler reads 0x6400
, It understands the number is hexadecimal with the help of 0x term. Usually we can understand by (6400)16 or (6400)8 or whatever ..
For binary it would be:
0b00000001
Hope I have helped in some way.
Good day!