Base64 length calculation?

Solution 1:

Each character is used to represent 6 bits (log2(64) = 6).

Therefore 4 chars are used to represent 4 * 6 = 24 bits = 3 bytes.

So you need 4*(n/3) chars to represent n bytes, and this needs to be rounded up to a multiple of 4.

The number of padding chars added by rounding up to a multiple of 4 will be 0, 1 or 2, never 3: a final group of 1 or 2 bytes needs 2 or 3 chars, leaving 2 or 1 positions to pad.
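
The arithmetic above can be sketched in Python and checked against the standard library's base64.b64encode. This is only an illustration; the function names encoded_length and padding_chars are made up here and do not come from any library.

```python
import base64


def encoded_length(n):
    """Padded Base64 length for n input bytes: 4 chars per (possibly partial) 3-byte group."""
    # (n + 2) // 3 is ceil(n / 3) in integer arithmetic.
    return 4 * ((n + 2) // 3)


def padding_chars(n):
    """Number of trailing '=' characters when encoding n input bytes."""
    return (3 - n % 3) % 3


# Compare both helpers with the real encoder for a range of input sizes.
for n in range(50):
    encoded = base64.b64encode(bytes(n))  # bytes(n) is n zero bytes
    assert len(encoded) == encoded_length(n)
    assert encoded.count(b"=") == padding_chars(n)
```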

Solution 2:

4 * n / 3 (rounded up) gives the unpadded length.

Then round up to the nearest multiple of 4 for padding. As 4 is a power of 2, this can be done with bitwise operations (here / is integer division):

((4 * n / 3) + 3) & ~3
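
As a quick sanity check, the bitwise expression can be compared with the plain "4 chars per 3-byte group" formula. The sketch below assumes the division is truncating integer division (as in C for non-negative values), which is written // in Python; both function names are hypothetical.

```python
def padded_length_bitwise(n):
    # Truncating division, then round up to a multiple of 4 by
    # adding 3 and clearing the two lowest bits.
    return ((4 * n // 3) + 3) & ~3


def padded_length_ceil(n):
    # Reference formula: 4 * ceil(n / 3) using integer arithmetic only.
    return 4 * ((n + 2) // 3)


# The two formulas agree for every non-negative size tested.
assert all(padded_length_bitwise(n) == padded_length_ceil(n) for n in range(10000))
```

Even though the truncating division can undercount the unpadded length by one, the subsequent rounding to a multiple of 4 absorbs the difference, so the padded result still matches.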

Solution 3:

For reference, the Base64 encoder's length formula is as follows:

[Image: Base64 encoder's length formula]

As you said, a Base64 encoder given n bytes of data will produce a string of 4n/3 Base64 characters, not counting padding. Put another way, every 3 bytes of data will result in 4 Base64 characters. EDIT: A comment correctly points out that my previous graphic did not account for padding; the correct formula including padding is 4 * ceil(n / 3).

The Wikipedia article shows exactly how the ASCII string Man is encoded into the Base64 string TWFu in its example. The input string is 3 bytes, or 24 bits, in size, so the formula correctly predicts the output will be 4 characters long: TWFu. The process encodes every 6 bits of data into one of the 64 Base64 characters, so the 24-bit input divided by 6 results in 4 Base64 characters.

You ask in a comment what the size of encoding 123456 would be. Keeping in mind that every character of that string is 1 byte, or 8 bits, in size (assuming ASCII/UTF-8 encoding), we are encoding 6 bytes, or 48 bits, of data. According to the equation, we expect the output length to be (6 bytes / 3 bytes) * 4 characters = 8 characters.

Putting 123456 into a Base64 encoder creates MTIzNDU2, which is 8 characters long, just as we expected.
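
Both examples can be reproduced with Python's standard base64 module. The helper name encode_and_predict is only illustrative; it returns the encoded string together with the length predicted by the padded formula.

```python
import base64


def encode_and_predict(text):
    """Return (Base64 string, predicted padded length) for an ASCII input string."""
    data = text.encode("ascii")
    encoded = base64.b64encode(data).decode("ascii")
    predicted = 4 * ((len(data) + 2) // 3)  # 4 * ceil(n / 3)
    return encoded, predicted


for text in ("Man", "123456"):
    encoded, predicted = encode_and_predict(text)
    print(text, "->", encoded, "length", len(encoded), "predicted", predicted)
```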