Converting char[] to byte[]

Solution 1:

Convert without creating String object:

import java.nio.CharBuffer;
import java.nio.ByteBuffer;
import java.util.Arrays;

byte[] toBytes(char[] chars) {
  CharBuffer charBuffer = CharBuffer.wrap(chars);
  ByteBuffer byteBuffer = Charset.forName("UTF-8").encode(charBuffer);
  byte[] bytes = Arrays.copyOfRange(byteBuffer.array(),
            byteBuffer.position(), byteBuffer.limit());
  Arrays.fill(byteBuffer.array(), (byte) 0); // clear sensitive data
  return bytes;
}

Usage:

char[] chars = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
byte[] bytes = toBytes(chars);
/* do something with chars/bytes */
Arrays.fill(chars, '\u0000'); // clear sensitive data
Arrays.fill(bytes, (byte) 0); // clear sensitive data

Solution is inspired from Swing recommendation to store passwords in char[]. (See Why is char[] preferred over String for passwords?)

Remember not to write sensitive data to logs and ensure that JVM won't hold any references to it.


The code above is correct but not effective. If you don't need performance but want security you can use it. If security also not a goal then do simply String.getBytes. Code above is not effective if you look down of implementation of encode in JDK. Besides you need to copy arrays and create buffers. Another way to convert is inline all code behind encode (example for UTF-8):

val xs: Array[Char] = "A ß € 嗨 𝄞 🙂".toArray
val len = xs.length
val ys: Array[Byte] = new Array(3 * len) // worst case
var i = 0; var j = 0 // i for chars; j for bytes
while (i < len) { // fill ys with bytes
  val c = xs(i)
  if (c < 0x80) {
    ys(j) = c.toByte
    i = i + 1
    j = j + 1
  } else if (c < 0x800) {
    ys(j) = (0xc0 | (c >> 6)).toByte
    ys(j + 1) = (0x80 | (c & 0x3f)).toByte
    i = i + 1
    j = j + 2
  } else if (Character.isHighSurrogate(c)) {
    if (len - i < 2) throw new Exception("overflow")
    val d = xs(i + 1)
    val uc: Int = 
      if (Character.isLowSurrogate(d)) {
        Character.toCodePoint(c, d)
      } else {
        throw new Exception("malformed")
      }
    ys(j) = (0xf0 | ((uc >> 18))).toByte
    ys(j + 1) = (0x80 | ((uc >> 12) & 0x3f)).toByte
    ys(j + 2) = (0x80 | ((uc >>  6) & 0x3f)).toByte
    ys(j + 3) = (0x80 | (uc & 0x3f)).toByte
    i = i + 2 // 2 chars
    j = j + 4
  } else if (Character.isLowSurrogate(c)) {
    throw new Exception("malformed")
  } else {
    ys(j) = (0xe0 | (c >> 12)).toByte
    ys(j + 1) = (0x80 | ((c >> 6) & 0x3f)).toByte
    ys(j + 2) = (0x80 | (c & 0x3f)).toByte
    i = i + 1
    j = j + 3
  }
}
// check
println(new String(ys, 0, j, "UTF-8"))

Excuse me for using Scala language. If you have problems with converting this code to Java I can rewrite it. What about performance always check on real data (with JMH for example). This code looks very similar to what you can see in JDK[2] and Protobuf[3].

Solution 2:

char[] ch = ?
new String(ch).getBytes();

or

new String(ch).getBytes("UTF-8");

to get non-default charset.

Update: Since Java 7: new String(ch).getBytes(StandardCharsets.UTF_8);

Solution 3:

Edit: Andrey's answer has been updated so the following no longer applies.

Andrey's answer (the highest voted at the time of writing) is slightly incorrect. I would have added this as comment but I am not reputable enough.

In Andrey's answer:

char[] chars = {'c', 'h', 'a', 'r', 's'}
byte[] bytes = Charset.forName("UTF-8").encode(CharBuffer.wrap(chars)).array();

the call to array() may not return the desired value, for example:

char[] c = "aaaaaaaaaa".toCharArray();
System.out.println(Arrays.toString(Charset.forName("UTF-8").encode(CharBuffer.wrap(c)).array()));

output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 0]

As can be seen a zero byte has been added. To avoid this use the following:

char[] c = "aaaaaaaaaa".toCharArray();
ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
System.out.println(Arrays.toString(b));

output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97]

As the answer also alluded to using passwords it might be worth blanking out the array that backs the ByteBuffer (accessed via the array() function):

ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
blankOutByteArray(bb.array());
System.out.println(Arrays.toString(b));

Solution 4:

private static byte[] charArrayToByteArray(char[] c_array) {
        byte[] b_array = new byte[c_array.length];
        for(int i= 0; i < c_array.length; i++) {
            b_array[i] = (byte)(0xFF & (int)c_array[i]);
        }
        return b_array;
}