Difference between BYTE and CHAR in column datatypes

Solution 1:

Let us assume the database character set is UTF-8, which is the recommended setting in recent versions of Oracle. In this case, some characters take more than 1 byte to store in the database.

If you define the field as VARCHAR2(11 BYTE), Oracle can use up to 11 bytes for storage, but you may not actually be able to store 11 characters in the field, because some of them take more than one byte to store, e.g. non-English characters.

By defining the field as VARCHAR2(11 CHAR) you tell Oracle it can use enough space to store 11 characters, no matter how many bytes it takes to store each one. A single character may require up to 4 bytes.
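To make the two limits concrete, here is a minimal Python sketch that mimics them outside the database. The helper names `fits_byte_semantics` and `fits_char_semantics` are made up for illustration, and Python's `len()` counts Unicode code points, which is assumed here to be a reasonable stand-in for how characters are counted under CHAR semantics in a UTF-8 database.

```python
def fits_byte_semantics(text: str, limit: int, encoding: str = "utf-8") -> bool:
    """Mimic VARCHAR2(limit BYTE): the encoded text must fit in `limit` bytes."""
    return len(text.encode(encoding)) <= limit


def fits_char_semantics(text: str, limit: int) -> bool:
    """Mimic VARCHAR2(limit CHAR): the text must have at most `limit` characters."""
    return len(text) <= limit


# 11 ASCII characters: each takes 1 byte in UTF-8.
ascii_text = "hello world"

# 11 characters, 4 of them accented ("déjà vu été"); each accented
# character takes 2 bytes in UTF-8, so the string needs 15 bytes.
accented_text = "d\u00e9j\u00e0 vu \u00e9t\u00e9"

if __name__ == "__main__":
    for value in (ascii_text, accented_text):
        print(
            value,
            "| chars:", len(value),
            "| bytes:", len(value.encode("utf-8")),
            "| fits 11 BYTE:", fits_byte_semantics(value, 11),
            "| fits 11 CHAR:", fits_char_semantics(value, 11),
        )
```

Both strings pass the 11-character check, but only the pure ASCII one passes the 11-byte check.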

Solution 2:

One has space for exactly 11 bytes, the other for exactly 11 characters. Some character sets, such as Unicode variants, may use more than one byte per character, so the 11-byte field might have space for fewer than 11 characters, depending on the encoding.
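As a rough illustration of how the byte count depends on the encoding, the following Python sketch encodes the same three-character string in several encodings. The function name `byte_lengths` and the choice of encodings are assumptions made for this example; the little-endian variants are used so that no byte order mark is added to the count.

```python
def byte_lengths(text: str) -> dict:
    """Return the number of bytes `text` needs in a few common encodings."""
    encodings = ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


# "été": 3 characters, 2 of them accented.
sample = "\u00e9t\u00e9"

if __name__ == "__main__":
    print("characters:", len(sample))
    for encoding, size in byte_lengths(sample).items():
        print(f"{encoding}: {size} bytes")
```

The character count stays at 3 throughout, while the byte count ranges from 3 (single-byte encoding) to 12 (four bytes per character).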

See also http://www.joelonsoftware.com/articles/Unicode.html

Solution 3:

Depending on the system configuration, the size of a character measured in bytes can vary. In your examples:

  1. Limits the field to 11 bytes
  2. Limits the field to 11 characters

Conclusion: 1 CHAR is not necessarily equal to 1 BYTE.

Solution 4:

I am not sure since I am not an Oracle user, but I assume that the difference matters when you use multi-byte character sets such as Unicode (UTF-16/32). In this case, 11 bytes could hold fewer than 11 characters.

Those field types might also be treated differently with regard to accented characters or case; for example, 'binaryField(ete) = "été"' will not match while 'charField(ete) = "été"' might (again, not sure about Oracle).

Solution 5:

In simple words, when you write NAME VARCHAR2(11 BYTE), only 11 bytes can be accommodated in that variable.

This is true whichever character set you are using. For example, with a two-byte encoding such as Unicode (UTF-16), only about half as many characters (5 here) can be accommodated in NAME.

On the other hand, if you write NAME VARCHAR2(11 CHAR), then NAME can accommodate 11 characters regardless of their character encoding.

BYTE is the default if you specify neither BYTE nor CHAR, unless the NLS_LENGTH_SEMANTICS parameter has been changed to CHAR.

So if you write NAME VARCHAR2(4000 BYTE) and use the Unicode (UTF-16) character encoding, then only 2000 characters can be accommodated in NAME.

That means the size limit on the variable is applied in bytes, and how many characters can be accommodated in that variable depends on the character encoding.
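The arithmetic behind those figures can be sketched in Python as follows. The helper `max_chars` is hypothetical, and it assumes every character takes the same fixed number of bytes, which is a simplification: in a variable-width encoding the real capacity depends on the actual data.

```python
def max_chars(limit_bytes: int, bytes_per_char: int) -> int:
    """Whole characters that fit in `limit_bytes` at a fixed width per character."""
    return limit_bytes // bytes_per_char


if __name__ == "__main__":
    # 4000-byte limit with 2 bytes per character, as in the example above.
    print(max_chars(4000, 2))
    # 11-byte limit with 2 bytes per character: the leftover byte is unusable.
    print(max_chars(11, 2))
    # 4000-byte limit if every character needed 4 bytes.
    print(max_chars(4000, 4))
    # Cross-check against a real encoding: 2000 ASCII letters in UTF-16.
    print(len(("a" * 2000).encode("utf-16-le")))
```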