Is there any downside to using negative array indices in C?

It's been fairly well-asked here if negative indices are allowed in C, but I'm curious if there is any performance downside to using this technique frequently. For example, does it break the compiler's ability to use the base+offset indexing on some hardware platform somewhere, or confuse the optimizer, etc.

I ask because I have a lot of decoding routines with the signature of

decode_something(char *buffer, int buffer_size, int bit_position) {
    ...
    if (((bit_position + need_bits) >> 3) < buffer_size) {
        x = buffer[bit_position >> 3] ...
        bit_position += need_bits;
    ...
}

and I realized I can greatly simplify all that code if I use a pointer to the end of the buffer:

decode_something(char *buf_limit, int bits_remaining) {
    ...
    if (need_bits <= bits_remaining) {
        x= buf_limit[ -((bits_remaining+7) >> 3) ];
        bits_remaining -= need_bits;
    ...
}

(Admittedly, this example doesn't show how much simpler it gets, but you can extrapolate to all the cases where calling this function or saving the parse state requires only two variables instead of three.)
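For concreteness, here is a minimal sketch of the end-pointer pattern as a complete function. The name `read_bits`, the MSB-first bit order, the 32-bit cap, and passing `bits_remaining` by pointer are all assumptions made for illustration; they are not taken from the original routines.

```c
/* Hypothetical helper: reads need_bits bits (0..32), MSB first, from the
   bits that remain before buf_limit, which points one past the end of
   the buffer. Returns 1 and stores the value in *out on success; returns 0
   and leaves the state untouched if the request cannot be satisfied. */
int read_bits(const unsigned char *buf_limit, int *bits_remaining,
              int need_bits, unsigned long *out)
{
    if (need_bits < 0 || need_bits > 32 || need_bits > *bits_remaining)
        return 0;

    unsigned long value = 0;
    for (int i = 0; i < need_bits; i++) {
        int rem = *bits_remaining - i;                    /* bits left, including this one */
        unsigned char byte = buf_limit[-((rem + 7) >> 3)]; /* negative, end-relative index */
        int shift = (rem - 1) & 7;                        /* MSB of each byte comes first */
        value = (value << 1) | ((byte >> shift) & 1u);
    }

    *bits_remaining -= need_bits;
    *out = value;
    return 1;
}
```

With a two-byte buffer `{0xA5, 0x3C}` and `bits_remaining` starting at 16, successive reads of 4, 8 and 4 bits yield `0xA`, `0x53` and `0xC`. The whole parse state is the pair (`buf_limit`, `bits_remaining`).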

Now I'm considering using this pattern throughout an entire library, but I wanted to know if there's some reason not to do that.


Solution 1:

For example, does it break the compiler's ability to use the base+offset indexing on some hardware platform somewhere, or confuse the optimizer, etc.

No. The C language standard clearly defines it:

From the C11 Standard, 6.5.6p8:

8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. ....

Just in case you are not aware:

From the C11 Standard, 6.5.2.1p2, the definition of the subscript operator:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
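As a small illustration of the two quoted paragraphs, the sketch below reads an element relative to a one-past-the-end pointer in both spellings. The function names `from_end` and `from_end_explicit` are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Returns the element n positions before limit, where limit points one
   past the last element of an int array and 1 <= n <= array length. */
int from_end(const int *limit, ptrdiff_t n)
{
    return limit[-n];      /* by 6.5.2.1p2 this is *((limit) + (-n)) */
}

/* The same access written as explicit pointer arithmetic (6.5.6p8). */
int from_end_explicit(const int *limit, ptrdiff_t n)
{
    return *(limit - n);
}
```

For `int a[4] = {10, 20, 30, 40}` and `limit = a + 4`, both functions return 40 for `n == 1` and 10 for `n == 4`. The index here is a signed type (`ptrdiff_t`); negating an unsigned index would wrap to a large positive value instead of producing a negative offset.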

Your other question:

but I'm curious if there is any performance downside to using this technique frequently.

It boils down to this question: given two numbers a and b, which of the two operations a + b and a - b is faster?

It's not the language that dictates the time taken by operations such as addition and subtraction; it's the underlying processor. If you are really interested, you have to dig into the instruction set of the underlying processor and compare the latencies of the respective instructions.
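One way to check a particular compiler and target is to write the same loop both ways and compare the generated assembly, for example with the compiler's `-S` option at your usual optimization level. The two functions below are a hypothetical test pair, not code from the question; they only give the compiler equivalent work in forward and end-relative form.

```c
#include <stddef.h>

/* Forward indexing from the start of the buffer. */
long sum_forward(const unsigned char *buf, ptrdiff_t size)
{
    long total = 0;
    for (ptrdiff_t i = 0; i < size; i++)
        total += buf[i];
    return total;
}

/* End-relative indexing: buf_limit points one past the last byte and
   the index -remaining is negative throughout the loop. */
long sum_from_limit(const unsigned char *buf_limit, ptrdiff_t remaining)
{
    long total = 0;
    for (; remaining > 0; remaining--)
        total += buf_limit[-remaining];
    return total;
}
```

Both functions visit the bytes in the same order and return the same sum, so any difference in the emitted instructions comes from the addressing style alone.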