What is "bit padding" or "padding bits" exactly?
I often find the term "bit padding" in relation to data types, but I don't understand what it is or what exactly it does to them.
The gist of it is that padding bits are "wasted" space. I say "wasted" because, while padding makes the object bigger, it can make working with the object much easier (which means faster), and the small amount of wasted space can yield large performance gains. In some cases it is essential, because the CPU can't work with objects of the unpadded size at all.
Let's say you have a struct like (all numbers are just an example, different platforms can have different values):
struct foo
{
    short a; // 16 bits
    char  b; // 8 bits
};
and the machine you are working with reads 32 bits of data in a single read operation. Reading a single foo is not a problem, since the entire object fits into one 32-bit chunk. What does become a problem is an array. The important thing to remember about arrays is that they are contiguous: there is no space between elements, just one object immediately followed by another. So, if you have an array like
foo array[10]{};
With this, the first foo object is in a 32-bit bucket. The next element of the array, though, will straddle the first 32-bit bucket and the second 32-bit bucket, which means its member a is split across two separate buckets. Some processors can handle that (at a cost), and other processors will just crash if you try. To solve both of those problems, the compiler adds padding bits to the end of foo to pad out its size. This means foo actually becomes
struct foo
{
    short a; // 16 bits
    char  b; // 8 bits
    char  _; // 8 bits of padding
};
And now it is easy for the processor to handle foo objects, whether by themselves or in an array. It doesn't need to do any extra work, and you've only added 8 bits per object. You'd need a lot of objects for that to start to matter on a modern machine.
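You can see what the compiler actually did with sizeof and alignof. A minimal sketch (the exact numbers are implementation-defined; the comments assume the common layout above, where short is 16 bits):

#include <iostream>

struct foo
{
    short a; // 16 bits
    char  b; // 8 bits
};

int main()
{
    // Typically prints 4, not 3: the compiler added one byte of
    // trailing padding so array elements stay aligned.
    std::cout << "sizeof(foo):  " << sizeof(foo) << '\n';
    // Typically prints 2 (the alignment requirement of short).
    std::cout << "alignof(foo): " << alignof(foo) << '\n';
}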
There are also times when you need padding between members of the type to avoid unaligned access. Let's say you have
struct bar
{
    char c; // 8 bits
    int  d; // 32 bits
};
Now bar is 40 bits wide, and d more often than not will be split across two different buckets again. To fix this, the compiler adds padding bits between c and d, like
struct bar
{
    char c;    // 8 bits
    char _[3]; // 24 bits of padding
    int  d;    // 32 bits
};
and now d is guaranteed to go into a single 32-bit bucket.
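Again, the inserted padding is observable. A small sketch using offsetof (the numbers in the comments assume the typical layout above, where int is 32 bits):

#include <cstddef>
#include <iostream>

struct bar
{
    char c; // 8 bits
    int  d; // 32 bits
};

int main()
{
    // Typically prints 0 and 4: the compiler inserted 3 bytes
    // of padding after c so that d starts on a 4-byte boundary.
    std::cout << "offsetof(bar, c): " << offsetof(bar, c) << '\n';
    std::cout << "offsetof(bar, d): " << offsetof(bar, d) << '\n';
    std::cout << "sizeof(bar):      " << sizeof(bar) << '\n'; // typically 8
}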
bit padding:
Bit padding is the addition of one or more extra bits to a transmission or storage unit to make it conform to a standard size.
As the definition you posted is already correct, I'll try to explain with an example:
Suppose you have to store data that occupies fewer than 32 bits, but you have 4-byte slots. It is easier to access that data slot by slot, so you just fill out all 32 bits. The additional bits needed to fill 'the given space', but which are not part of the data, are the bit padding.
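As a concrete sketch (a made-up example, assuming a 20-bit payload stored in 32-bit slots; the mask and value are arbitrary):

#include <cstdint>
#include <iostream>

int main()
{
    // A 20-bit value stored in a 32-bit slot: the top 12 bits
    // are not part of the data, they are just padding (zeros here).
    std::uint32_t slot = 0x12345 & 0x000FFFFF; // keep only the low 20 bits
    std::cout << std::hex << slot << '\n';     // prints 12345
}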
I'm sure there are better examples of this in other contexts. Anyone, feel free to edit and/or complete the answer with improvements or examples.
Hope this helps!
So imagine you have an 8-bit number, a uint8_t, and its value is set to 4. This would probably be stored as a = 0000 0100. Now, let's say you wish to convert this into a 16-bit number. What would happen? You have to assign some values to the 'new' bits in this number. How would you assign them? You can't randomly assign zeros or ones; the value of the original variable would change. Depending on the architecture etc., you have to pad the value with extra bits. In my case, that would mean eight extra zeros added in front of the original MSB (most significant bit), making our number a = 0000 0000 0000 0100.
The value is still 4, but now you can store anything in the [0, 2^16) range instead of the [0, 2^8) range.
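In C++ this zero-extension happens implicitly when you widen an unsigned type; a minimal sketch:

#include <bitset>
#include <cstdint>
#include <iostream>

int main()
{
    std::uint8_t  a = 4;
    std::uint16_t b = a; // zero-extended: eight 0 bits padded in front

    std::cout << std::bitset<8>(a)  << '\n'; // 00000100
    std::cout << std::bitset<16>(b) << '\n'; // 0000000000000100
}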