Base64 length calculation?
Solution 1:
Each character is used to represent 6 bits (log2(64) = 6).
Therefore 4 chars are used to represent 4 * 6 = 24 bits = 3 bytes.
So you need 4*(n/3) chars to represent n bytes, and this needs to be rounded up to a multiple of 4.
The number of padding chars resulting from the rounding up to a multiple of 4 will be 0, 1 or 2. It can never be 3, because even a single leftover input byte needs 2 chars, leaving at most 2 positions to pad.
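As a rough check of the reasoning above, here is a minimal Python sketch. The helper name base64_lengths is made up for illustration; it uses integer arithmetic for the rounding and compares the result against the standard library's base64.b64encode.

```python
import base64

def base64_lengths(n):
    """Return (padded_length, padding_chars) for n input bytes."""
    padded = 4 * ((n + 2) // 3)    # 4 * ceil(n / 3)
    unpadded = (4 * n + 2) // 3    # ceil(8n / 6): chars that carry data
    return padded, padded - unpadded

# Compare against the real encoder for a few small sizes.
for n in range(10):
    encoded = base64.b64encode(b"x" * n)
    padded, pad = base64_lengths(n)
    assert len(encoded) == padded
    assert encoded.count(b"=") == pad
    assert pad in (0, 1, 2)
```

For example, 1 byte gives 4 chars with 2 of them padding, and 2 bytes give 4 chars with 1 of them padding.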
Solution 2:
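4 * n / 3 gives the unpadded length (rounded up to a whole number of chars).
Then round up to the nearest multiple of 4 for padding. As 4 is a power of 2, you can use a bitwise operation, which also gives the right result when 4 * n / 3 is computed with truncating integer division:
((4 * n / 3) + 3) & ~3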
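A quick way to convince yourself that the bitwise form is right is to compare it against 4 * ceil(n / 3) over a range of sizes. This is only a sketch in Python, with illustrative function names; note that Python needs // to get the truncating division that C-like languages apply to integers.

```python
def padded_len_bitwise(n):
    # // mirrors integer division of 4 * n / 3 for non-negative n;
    # "& ~3" clears the two low bits, i.e. rounds down to a multiple of 4.
    return ((4 * n // 3) + 3) & ~3

def padded_len_ceil(n):
    # 4 * ceil(n / 3), written with integer arithmetic only.
    return 4 * ((n + 2) // 3)

# The two formulas agree for every size tried here.
assert all(padded_len_bitwise(n) == padded_len_ceil(n) for n in range(10000))
```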
Solution 3:
For reference, here is how the Base64 encoder's length formula works.
As you said, a Base64 encoder given n bytes of data will produce a string of roughly 4n/3 Base64 characters. Put another way, every 3 bytes of data will result in 4 Base64 characters. EDIT: A comment correctly points out that my previous graphic did not account for padding; the correct formula with padding is 4 * Ceiling(n / 3).
The Wikipedia article shows exactly how the ASCII string Man is encoded into the Base64 string TWFu in its example. The input string is 3 bytes, or 24 bits, in size, so the formula correctly predicts the output will be 4 characters (4 bytes, or 32 bits) long: TWFu. The process encodes every 6 bits of data into one of the 64 Base64 characters, so the 24-bit input divided by 6 results in 4 Base64 characters.
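To make the 6-bit grouping concrete, here is a small Python sketch that encodes exactly one 3-byte group by hand. The function encode_triplet is a hypothetical helper written for this illustration, not part of any library; it assumes the standard Base64 alphabet (A-Z, a-z, 0-9, +, /).

```python
import string

# Standard Base64 alphabet: index 0..63 maps to one character.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_triplet(chunk):
    """Encode exactly 3 bytes (24 bits) as 4 Base64 characters."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(chunk, "big")  # the 24 bits as one integer
    # Slice out four 6-bit groups, most significant first.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))

print(encode_triplet(b"Man"))  # prints TWFu
```

The four 6-bit groups for Man are 19, 22, 5 and 46, which index to T, W, F and u.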
You ask in a comment what the size of encoding 123456 would be. Keeping in mind that every character of that string is 1 byte, or 8 bits, in size (assuming ASCII/UTF8 encoding), we are encoding 6 bytes, or 48 bits, of data. According to the equation, we expect the output length to be (6 bytes / 3 bytes) * 4 characters = 8 characters. Because 6 is a multiple of 3, no rounding up or padding is needed here.
Putting 123456 into a Base64 encoder creates MTIzNDU2, which is 8 characters long, just as we expected.
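The same check can be reproduced with Python's standard base64 module. This is just a sketch: the helper name expected_length is made up for illustration, and the second input (1234) is added only to show a case where the length is not a multiple of 3.

```python
import base64

def expected_length(n):
    # 4 * ceil(n / 3), using integer arithmetic.
    return 4 * ((n + 2) // 3)

for text in ("123456", "1234"):
    data = text.encode("ascii")  # 1 byte per character
    encoded = base64.b64encode(data).decode("ascii")
    assert len(encoded) == expected_length(len(data))
    print(text, "->", encoded, len(encoded))
# prints:
# 123456 -> MTIzNDU2 8
# 1234 -> MTIzNA== 8
```

The 4-byte input also produces 8 characters, but 2 of them are = padding, which is why the formula needs the ceiling.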