Why does a=(b++) have the same behavior as a=b++?

I am writing a small test app in C with GCC 4.8.4, which came pre-installed on my Ubuntu 14.04, and I am confused by the fact that the expression a=(b++); behaves the same way as a=b++; does. I am using the following simple code:

#include <stdint.h>
#include <stdio.h>

int main(int argc, char* argv[]){
    uint8_t a1, a2, b1=10, b2=10;
    a1=(b1++);
    a2=b2++;

    printf("a1=%u, a2=%u, b1=%u, b2=%u.\n", a1, a2, b1, b2);

}

The result after gcc compilation is a1=a2=10, while b1=b2=11. However, I expected the parentheses to have b1 incremented before its value is assigned to a1.

That is, I expected a1 to be 11 and a2 to be 10.

Does anyone have an idea what is going on here?


Solution 1:

"However, I expected the parentheses to have b1 incremented before its value is assigned to a1."

You should not have expected that: placing parentheses around an increment expression does not alter the application of its side effects.

Side effects (in this case, writing 11 into b1) are applied some time after the current value of b1 has been retrieved. The write may happen before or after the rest of the assignment expression is evaluated, but either way the value of the expression is the old value of b1. That is why a post-increment remains a post-increment, with or without parentheses around it. If you want a pre-increment, place ++ before the variable:

a1 = ++b1;

Solution 2:

Quoting from C99, section 6.5.2.4:

The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate type is added to it.) See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.

You can look up Annex C of C99 to see what the valid sequence points are.

In your question, adding parentheses doesn't change the sequence points; only the end of the full expression, marked by the ; character, does that.

In other words, you can think of it as if there were a temporary copy of b, with the increment of the original b as the side effect. Until a sequence point is reached, all evaluation is done on the temporary copy of b. By the time the sequence point is reached, the temporary copy has been discarded and the side effect, i.e. the increment, has been committed to storage.

Solution 3:

Parentheses can be tricky to think about. But they do not mean, "make sure that everything inside happens first".

Suppose we have

a = b + c * d;

The higher precedence of multiplication over addition tells us that the compiler will arrange to multiply c by d, and then add the result to b. If we want the other interpretation, we can use parentheses:

a = (b + c) * d;

But suppose that we have some function calls thrown into the mix. That is, suppose we write

 a = x() + y() * z();

Now, while it's clear that the return value of y() will be multiplied by the return value of z(), can we say anything about the order that x(), y(), and z() will be called in? The answer is, no, we absolutely cannot! If you're at all unsure, I invite you to try it, using x, y, and z functions like this:

int x() { printf("this is x()\n"); return 2; }
int y() { printf("this is y()\n"); return 3; }
int z() { printf("this is z()\n"); return 4; }

The first time I tried this, using the compiler in front of me, I discovered that function x() was called first, even though its result is needed last. When I changed the calling code to

 a = (x() + y()) * z();

the order of the calls to x, y, and z stayed exactly the same, the compiler just arranged to combine their results differently.

Finally, it's important to realize that expressions like i++ do two things: they take i's value and add 1 to it, and then they store the new value back into i. But the store back into i doesn't necessarily happen right away; it can happen later. And the question of "when exactly does the store back into i happen?" is sort of like the question of "when does function x get called?". You can't really tell: it's up to the compiler, it usually doesn't matter, and it will differ from compiler to compiler. If you really care, you're going to have to do something else to force the order.

And in any case, remember that the definition of i++ is that it gives the old value of i out to the surrounding expression. That's a pretty absolute rule, and it cannot be changed just by adding some parentheses! That's not what parentheses do.

Let's go back to the previous example involving functions x, y, and z. I noticed that function x was called first. Suppose I didn't want that; suppose I wanted functions y and z to be called first. Could I achieve that by writing the following?

 a = x() + (y() * z());

I could write that, but it doesn't change anything. Remember, the parentheses don't mean "do everything inside first". They do cause the multiplication to happen before the addition, but the compiler was already going to do it that way anyway, based on the higher precedence of multiplication over addition.

Up above I said, "if you really care, you're going to have to do something else to force the order". What you generally have to do is use some temporary variables and some extra statements. (The technical term is "insert some sequence points.") For example, to cause y and z to get called first, I could write

c = y();
d = z();
b = x();
a = b + c * d;

In your case, if you wanted to make sure that the new value of b got assigned to a, you could write

c = b++;
a = b;

But of course that's silly -- if all you want to do is increment b and have its new value assigned to a, that's what prefix ++ is for:

a = ++b;

Solution 4:

Your expectations are completely unfounded.

Parentheses have no direct effect on the order of execution. They don't introduce sequence points into the expression and thus they don't force any side-effects to materialize earlier than they would've materialized without parentheses.

Moreover, by definition, the post-increment expression b++ evaluates to the original value of b. This requirement remains in place regardless of how many pairs of parentheses you add around b++. Even if parentheses somehow "forced" an instant increment, the language would still require (((b++))) to evaluate to the old value of b, meaning that a would still be guaranteed to receive the non-incremented value of b.

Parentheses only affect the syntactic grouping between operators and their operands. For example, in your original expression a = b++ one might ask whether the ++ applies to b alone or to the result of a = b. By adding the parentheses you simply made explicit that the ++ operator applies to (groups with) the b operand. However, according to the language syntax (and the operator precedence and associativity derived from it), ++ already applies to b, i.e. postfix ++ has higher precedence than binary =. Your parentheses did not change anything; they only restated the grouping that was already there implicitly. Hence no change in behavior.

Solution 5:

Parentheses are entirely syntactic. They just group expressions and they are useful if you want to override the precedence or associativity of operators. For example, if you use parentheses here:

a = 2*(b+1);

you mean that the result of b+1 should be doubled, whereas if you omit the parentheses:

a = 2*b+1;

you mean that just b should be doubled and then the result should be incremented. The two syntax trees for these assignments are:

   =                      =
  / \                    / \
 a   *                  a   +
    / \                    / \
   2   +                  *   1
      / \                / \
     b   1              2   b

a = 2*(b+1);            a = 2*b+1;

By using parentheses, you can therefore change the syntax tree that corresponds to your program and (of course) different syntax may correspond to different semantics.

On the other hand, in your program:

a1 = (b1++);
a2 = b2++;

parentheses are redundant because the assignment operator has lower precedence than the postfix increment (++). The two assignments are equivalent; in both cases, the corresponding syntax tree is the following:

    =
   / \
  a   ++ (postfix)
      |
      b

Now that we're done with the syntax, let's go to semantics. This statement means: evaluate b++ and assign the result to a. Evaluating b++ returns the current value of b (which is 10 in your program) and, as a side effect, increments b (which now becomes 11). The returned value (that is, 10) is assigned to a. This is what you observe, and this is the correct behaviour.