Byte and char conversion in Java
Solution 1:
A character in Java is a Unicode code-unit which is treated as an unsigned number. So if you perform c = (char)b
the value you get is 2^16 - 56 or 65536 - 56.
Or more precisely, the byte is first converted to a signed integer with the value 0xFFFFFFC8
using sign extension in a widening conversion. This in turn is then narrowed down to 0xFFC8
when casting to a char
, which translates to the positive number 65480
.
From the language specification:
5.1.4. Widening and Narrowing Primitive Conversion
First, the byte is converted to an int via widening primitive conversion (§5.1.2), and then the resulting int is converted to a char by narrowing primitive conversion (§5.1.3).
To get the right point use char c = (char) (b & 0xFF)
which first converts the byte value of b
to the positive integer 200
by using a mask, zeroing the top 24 bits after conversion: 0xFFFFFFC8
becomes 0x000000C8
or the positive number 200
in decimals.
Above is a direct explanation of what happens during conversion between the byte
, int
and char
primitive types.
If you want to encode/decode characters from bytes, use Charset
, CharsetEncoder
, CharsetDecoder
or one of the convenience methods such as new String(byte[] bytes, Charset charset)
or String#toBytes(Charset charset)
. You can get the character set (such as UTF-8 or Windows-1252) from StandardCharsets
.