James Gosling's explanation of why Java's byte is signed

I was initially surprised that Java specifies byte as signed, with a range of -128..127 (inclusive). I was under the impression that most 8-bit number representations are unsigned, with a range of 0..255 instead (e.g. the octets of an IPv4 address in dot-decimal notation).

So has James Gosling ever been asked to explain why he decided that byte is signed? Have there been any notable discussions or debates about this issue in the past between authoritative programming language designers and/or critics?


It appears that simplicity was the main reason. From this interview:

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

My initial assumption was that it's because Java doesn't have unsigned numeric types at all. Why should byte be an exception? char is a special case because it has to represent UTF-16 code units (thanks to Jon Skeet for the quote).
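To make the byte/char contrast concrete, here is a minimal sketch (the class and method names are made up for illustration). It relies only on Java's implicit widening conversions: a byte is sign-extended when widened to int, while a char is zero-extended.

```java
class SignedByteDemo {
    // Widening a byte to int sign-extends, so the bit pattern 0xFF becomes -1.
    static int byteToInt(byte b) {
        return b;
    }

    // Widening a char to int zero-extends, so the bit pattern 0xFFFF stays 65535.
    static int charToInt(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + ".." + Byte.MAX_VALUE); // -128..127
        System.out.println(byteToInt((byte) 0xFF));                 // -1
        System.out.println(charToInt((char) 0xFFFF));               // 65535
    }
}
```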


According to the 'Oak Language Specification 0.2' (Oak being the original name of Java):

"The Oak byte type is what C programmers are used to thinking of as the char type. But in the Oak language, characters are 16 bits wide. Having a separate byte type removes the confusion in C between the interpretation of char as an 8 bit integer and as a character."

You can grab a PostScript copy from here:

http://cretesoft.com/archive/files/OakSpec0.2.ps (partial copy on scribd)

There is also part of an interview posted on this site, in which he defends the absence of an unsigned byte in Java:

http://www.darksleep.com/player/JavaAndUnsignedTypes.html

Here is the interview excerpt taken from the above-mentioned page:

http://www.gotw.ca/publications/c_family_interview.htm

Q: Programmers often talk about the advantages and disadvantages of programming in a "simple language." What does that phrase mean to you, and is [C/C++/Java] a simple language in your view?

Ritchie: [deleted for brevity]

Stroustrup: [deleted for brevity]

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

On the other hand, according to http://www.artima.com/weblogs/viewpost.jsp?thread=7555:

Once Upon an Oak ... by Heinz Kabutz July 15, 2003

... Trying to fill my gaps of Java's history, I started digging around on Sun's website, and eventually stumbled across the Oak Language Specification for Oak version 0.2. Oak was the original name of what is now commonly known as Java, and this manual is the oldest manual available for Oak (i.e. Java). ... Unsigned integer values (Section 3.1)

The specification says: "The four integer types of widths of 8, 16, 32 and 64 bits, and are signed unless prefixed by the unsigned modifier."

In the sidebar it says: "unsigned isn't implemented yet; it might never be." How right you were.


I'm not aware of any direct quotes from James Gosling, but there's an official RFE for unsigned byte:

Bug ID: 4186775: request unsigned integer types, esp. unsigned byte

State: 11-Closed, Will Not Fix, request for enhancement

Please extend the Java design to allow unsigned types, particularly unsigned byte.

I have been wondering why there are no unsigned integer types in Java. It seems to me that for byte-length values it is extremely awkward not to have them [...]

I recognize that this was a design decision made by the Java developers. What I don't understand is why. Did they consider unsigned integer types evil or harmful, and chose to protect me from myself?
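For context on the awkwardness the RFE describes: the usual workaround is to mask a byte with 0xFF whenever its unsigned value is needed. Below is a small sketch using the IPv4 dot-decimal example from the question. The class and helper names are hypothetical. On Java 8 and later, the standard library's Byte.toUnsignedInt performs the same conversion as the toUnsigned helper here.

```java
class UnsignedByteSketch {
    // Reinterpret the 8 bits of a signed byte as an unsigned value in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Format raw address octets in dot-decimal notation.
    static String dotDecimal(byte[] octets) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < octets.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(toUnsigned(octets[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Values above 127 need an explicit cast because byte is signed.
        byte[] addr = {(byte) 192, (byte) 168, 0, (byte) 255};
        System.out.println(addr[0]);          // -64 (signed view of 192)
        System.out.println(dotDecimal(addr)); // 192.168.0.255
    }
}
```

Without the mask, printing each octet directly would show negative numbers for any value above 127.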


There's no reason for a byte to be unsigned. When you have a char type to represent characters, a byte would not normally do the job of a char.