On 32-bit CPUs, is an 'integer' type more efficient than a 'short' type?
On a 32-bit CPU, an integer is 4 bytes and a short integer is 2 bytes. If I am writing a C/C++ application that uses many numeric values that will always fit within the provided range of a short integer, is it more efficient to use 4 byte integers or 2 byte integers?
I have heard it suggested that 4 byte integers are more efficient as this fits the bandwidth of the bus from memory to the CPU. However, if I am adding together two short integers, would the CPU pack both values together and process them in parallel in a single pass (thus using the full 4 byte bandwidth of the bus)?
Solution 1:
If you have a large array of numbers, then go with the smallest size that works. It will be more efficient to work with an array of 16-bit shorts than 32-bit ints since you get twice the cache density. The cost of any sign extension the CPU has to do to work with 16-bit values in 32-bit registers is negligible compared to the cost of a cache miss.
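To make the cache density point concrete, here is a minimal sketch. The 64-byte cache line size is an assumption (common on x86, but not universal), and the helper names `elems_per_line` and `sum_i16` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check your target CPU. */
#define CACHE_LINE_BYTES 64

/* How many elements of the given size fit in one cache line. */
static size_t elems_per_line(size_t elem_size)
{
    return CACHE_LINE_BYTES / elem_size;
}

/* Sum an array of 16-bit values. Each element is widened to 32 bits
 * when it is loaded, so storage is 16-bit but the arithmetic is not.
 * The caller must ensure the total fits in an int32_t. */
static int32_t sum_i16(const int16_t *v, size_t n)
{
    int32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Under that assumption one cache line holds 32 shorts but only 16 ints, so a linear scan over the short array touches half as many lines.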
If you are simply using member variables in classes mixed with other data types, then it is less clear-cut, as the padding requirements will likely remove any space-saving benefit of the 16-bit values.
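A rough illustration of the padding effect follows. The struct names are hypothetical, and the sizes in the comments assume a typical ABI where `int32_t` is 4-byte aligned. Padding is implementation-defined, so check `sizeof` on your own compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* A 16-bit member sandwiched between 32-bit members.
 * Typical layout: 4 + 2 + 2 (padding) + 4 = 12 bytes. */
struct with_short {
    int32_t id;
    int16_t count;
    int32_t total;
};

/* The same record with a 32-bit count: also 12 bytes. */
struct with_int {
    int32_t id;
    int32_t count;
    int32_t total;
};

/* Two 16-bit members placed next to each other share one 4-byte slot.
 * Typical layout: 4 + 2 + 2 = 8 bytes. */
struct shorts_grouped {
    int32_t id;
    int16_t a;
    int16_t b;
};
```

On such an ABI the lone short saves nothing, while grouping the shorts together does save space.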
Solution 2:
Yes, you should definitely use a 32-bit integer on a 32-bit CPU, otherwise the compiler may end up emitting extra instructions to mask off the unused bits (i.e., it will always do the maths in 32 bits, then convert the answer to 16 bits).
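As a small sketch of what "do the maths in 32 bits, then convert" means in C: operands narrower than `int` are promoted to `int` before arithmetic, and the result is truncated when stored back into a 16-bit type. The function names are made up for illustration, and the example assumes a 32-bit `int`.

```c
#include <stdint.h>

/* Both operands are promoted to int, added at full width, and the
 * result is then truncated to 16 bits by the cast. Depending on the
 * target, that truncation may cost an extra instruction. */
static uint16_t add_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* The 32-bit version needs no truncation step. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
    return a + b;
}
```

With these definitions, `add_u16(65535, 1)` wraps to 0, while `add_u32(65535, 1)` gives 65536.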
It won't do two 16-bit operations at once for you, but you can pack the values and do it by hand if you're sure the results won't overflow.
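A minimal sketch of the do-it-yourself approach, for unsigned values only: pack two 16-bit lanes into one 32-bit word and add the words. All helper names are hypothetical. It is only correct when neither lane's sum exceeds 65535, because a carry out of the low lane would spill into the high lane.

```c
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit word (hi in the top half). */
static uint32_t pack2(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Extract the low and high 16-bit lanes. */
static uint16_t lane_lo(uint32_t p) { return (uint16_t)(p & 0xFFFFu); }
static uint16_t lane_hi(uint32_t p) { return (uint16_t)(p >> 16); }

/* Add both lanes with a single 32-bit addition.
 * Precondition: neither lane's sum overflows 16 bits. */
static uint32_t add2_no_overflow(uint32_t x, uint32_t y)
{
    return x + y;
}
```

For example, adding `pack2(100, 200)` and `pack2(7, 9)` yields lanes 107 and 209 from one addition.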
Edit: I should add that it also depends somewhat on your definition of "efficient". While it will be able to do 32-bit operations more quickly, you will of course use twice as much memory.
If these values are being used for intermediate calculations in an inner loop somewhere, then use 32-bit. If, however, you're reading the data from disk, or even if you just have to pay for cache misses, it may still work out better to use 16-bit integers. As with all optimizations, there's only one way to know: profile it.
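A crude way to start profiling is to time the same loop over both element widths. The sketch below uses `clock()` from the standard library. The function names and the test data are made up, and a real measurement should use your actual workload, optimization flags, and repeated runs.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum n 16-bit values; returns elapsed CPU seconds, sum in *out. */
static double time_sum_i16(const int16_t *v, size_t n, int64_t *out)
{
    clock_t start = clock();
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    *out = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Same loop over 32-bit values. */
static double time_sum_i32(const int32_t *v, size_t n, int64_t *out)
{
    clock_t start = clock();
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    *out = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Fill both arrays with the same values, time both loops, and print
 * the timings. Returns 0 if the sums agree, 1 if they differ, and
 * -1 if allocation fails. */
static int compare_widths(size_t n)
{
    int16_t *a = malloc(n * sizeof *a);
    int32_t *b = malloc(n * sizeof *b);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int16_t)(i % 1000);
        b[i] = a[i];
    }
    int64_t s16 = 0, s32 = 0;
    double t16 = time_sum_i16(a, n, &s16);
    double t32 = time_sum_i32(b, n, &s32);
    printf("16-bit: %.6f s, 32-bit: %.6f s\n", t16, t32);
    free(a);
    free(b);
    return s16 == s32 ? 0 : 1;
}
```

Calling `compare_widths` with an array much larger than your cache, and again with a small one, should show whether memory traffic or arithmetic dominates on your machine.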
Solution 3:
If you're using "many" integer values, the bottleneck in your processing is likely to be memory bandwidth. 16-bit integers pack more tightly into the data cache, and would therefore be a performance win.
If you are number crunching on a very large amount of data, you should read "What Every Programmer Should Know About Memory" by Ulrich Drepper. Concentrate on chapter 6, about maximizing the efficiency of the data cache.