uint8_t vs unsigned char
What is the advantage of using uint8_t over unsigned char in C?
I know that on almost every system uint8_t is just a typedef for unsigned char, so why use it?
It documents your intent - you will be storing small numbers, rather than a character.
Also it looks nicer if you're using other typedefs such as uint16_t or int32_t.
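For example, here is a minimal sketch of the kind of declaration where that consistency pays off (the struct and field names are made up for illustration):
#include <stdint.h>

/* Hypothetical message header: every field states its intended width.
 * "uint8_t version" reads as a small number, not a character, and it
 * lines up visually with the wider fields. */
typedef struct {
    uint8_t  version;
    uint8_t  flags;
    uint16_t payload_len;
    uint32_t sequence_no;
} msg_header;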
Just to be pedantic, some systems may not have an 8 bit type. According to Wikipedia:
An implementation is required to define exact-width integer types for N = 8, 16, 32, or 64 if and only if it has any type that meets the requirements. It is not required to define them for any other N, even if it supports the appropriate types.
So uint8_t isn't guaranteed to exist, though it will on all platforms where a byte is 8 bits. Some embedded platforms may be different, but that's getting very rare. Some systems may define their char type to be 16 bits wide, in which case there probably won't be an 8-bit type of any kind.
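If you need to handle that case, here is a sketch of a portability check: stdint.h defines UINT8_MAX exactly when uint8_t exists, so code can detect it and fall back to the always-present least-width type (the byte_t alias is just an example name):
#include <stdint.h>

#ifdef UINT8_MAX
typedef uint8_t byte_t;        /* exact 8-bit type is available */
#else
typedef uint_least8_t byte_t;  /* smallest type with at least 8 bits */
#endif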
Other than that (minor) issue, @Mark Ransom's answer is the best in my opinion. Use the one that most clearly shows what you're using the data for.
Also, I'm assuming you meant uint8_t (the standard typedef from C99, provided in the stdint.h header) rather than uint_8 (which is not part of any standard).
The whole point is to write implementation-independent code. unsigned char is not guaranteed to be an 8-bit type; uint8_t is (if it is available).
As you said, "almost every system". char is probably one of the types less likely to change, but once you start using uint16_t and friends, uint8_t blends in better, and may even be part of a coding standard.
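To make the difference concrete (a minimal sketch, not something the answer above requires): sizeof(unsigned char) is always 1, but that one byte is CHAR_BIT bits wide, which may be more than 8 on unusual platforms such as some DSPs. Code that assumes 8-bit bytes can fail the build early either by using uint8_t (it simply won't exist there) or with an explicit check:
#include <limits.h>

#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif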
In my experience there are two places where we want to use uint8_t to mean 8 bits (and uint16_t, etc.) and where we can have fields smaller than 8 bits. Both are places where space matters and where we often need to look at a raw dump of the data while debugging and be able to quickly determine what it represents.
The first is in RF protocols, especially in narrow-band systems. In this environment we may need to pack as much information as we can into a single message. The second is in flash storage where we may have very limited space (such as in embedded systems). In both cases we can use a packed data structure in which the compiler will take care of the packing and unpacking for us:
#include <stdint.h>

#pragma pack(1)   /* method 1: pack pragma, supported by most compilers */
typedef struct {
    uint8_t  flag1:1;
    uint8_t  flag2:1;
    uint8_t  reserved:6;   /* padding; not necessary, but it makes this struct more readable */
    uint32_t sequence_no;
    uint8_t  data[8];
    uint32_t crc32;
} __attribute__((packed)) s_mypacket;   /* method 2: GCC/Clang attribute */
#pragma pack()
Which method you use depends on your compiler. You may also need to support several different compilers with the same header files. This happens in embedded systems where devices and servers can be completely different - for example you may have an ARM device that communicates with an x86 Linux server.
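A minimal sketch of one common way to handle that, assuming GCC/Clang on one side and MSVC on the other (the PACK_BEGIN/PACK_END/PACKED macro names are made up for this example):
#if defined(__GNUC__) || defined(__clang__)
  #define PACK_BEGIN
  #define PACK_END
  #define PACKED __attribute__((packed))
#elif defined(_MSC_VER)
  #define PACK_BEGIN __pragma(pack(push, 1))
  #define PACK_END   __pragma(pack(pop))
  #define PACKED
#endif

PACK_BEGIN
typedef struct {
    uint16_t id;
    uint8_t  flags;
} PACKED s_smallpacket;   /* the same header compiles with both compilers */
PACK_END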
There are a few caveats to using packed structures. The biggest gotcha is that you must avoid taking the address of a member and dereferencing it. On systems that require aligned multi-byte accesses, this can result in a misaligned-access exception and a core dump.
Some folks will also worry about performance and argue that using these packed structures will slow down your system. It is true that, behind the scenes, the compiler adds code to access the unaligned data members. You can see that by looking at the assembly code in your IDE.
But since packed structures are most useful for communication and data storage, the data can be extracted into a non-packed representation when you work with it in memory. Normally you do not need to work with the entire data packet in memory anyway.
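A sketch of that extraction, reusing the s_mypacket type from above (the s_working type, the field choice, and the unpack name are made up; it assumes the raw buffer holds at least sizeof(s_mypacket) bytes):
#include <string.h>

/* Working copy with natural alignment; the compiler lays it out freely. */
typedef struct {
    uint32_t sequence_no;
    uint32_t crc32;
} s_working;

void unpack(const uint8_t *raw, s_working *out)
{
    s_mypacket pkt;
    memcpy(&pkt, raw, sizeof pkt);        /* copy the raw bytes into the packed view */
    out->sequence_no = pkt.sequence_no;   /* plain member access is safe; the        */
    out->crc32       = pkt.crc32;         /* compiler handles any misalignment       */
}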
Here is some relevant discussion:
pragma pack(1) nor __attribute__ ((aligned (1))) works
Is gcc's __attribute__((packed)) / #pragma pack unsafe?
http://solidsmoke.blogspot.ca/2010/07/woes-of-structure-packing-pragma-pack.html