Why can't GCC optimize the logical bitwise AND pair in "x && (x & 4242)" to "x & 4242"?

Exactly why should it be able to optimize the code? You're assuming that any transformation that works will be done. That's not at all how optimizers work. They're not Artificial Intelligences. They simply work by parametrically replacing known patterns. E.g. the "Common Subexpression Elimination" scans an expression for common subexpressions, and moves them forwards, if that does not change side effects.

(BTW, CSE shows that optimizers are already quite aware of what code movement is allowed in the possible presence of side effects. They know that you have to be careful with &&. Whether expr && expr can be CSE-optimized or not depends on the side effects of expr.)

So, in summary: which pattern do you think applies here?

You are correct that this appears to be a deficiency, and possibly an outright bug, in the optimizer.

Consider:

bool slow(int x)
{
  return x && (x & 4242);
}

bool slow2(int x)
{
  return (x & 4242) && x;
}

Assembly emitted by GCC 4.8.1 (-O3):

slow:
    xorl    %eax, %eax
    testl   %edi, %edi
    je      .L2
    andl    $4242, %edi
    setne   %al
.L2:
    rep ret

slow2:
    andl    $4242, %edi
    setne   %al
    ret

In other words, slow2 is misnamed.

I have only contributed the occasional patch to GCC, so whether my point of view carries any weight is debatable :-). But it is certainly strange, in my view, for GCC to optimize one of these and not the other. I suggest filing a bug report.

[Update]

Surprisingly small changes appear to make a big difference. For example:

bool slow3(int x)
{
  int y = x & 4242;
  return y && x;
}

...generates "slow" code again. I have no hypothesis for this behavior.

You can experiment with all of these on multiple compilers here.

This is how your code looks in ARM which should make slow run faster when input it 0.

fast(int):
    movw    r3, #4242
    and r3, r0, r3
    adds    r0, r3, #0
    movne   r0, #1
    bx  lr
slow(int):
    cmp r0, #0
    bxeq    lr
    movw    r3, #4242
    and r3, r0, r3
    adds    r0, r3, #0
    movne   r0, #1
    bx  lr

However GCC would optimize very nicely when you start using such trivial functions anyway.

bool foo() {
    return fast(4242) && slow(42);
}

becomes

foo():
    mov r0, #1
    bx  lr

My point is sometimes such code requires more context to be optimized further so why would implementers of optimizers (improvers!) should bother?

Another example:

bool bar(int c) {
  if (fast(c))
    return slow(c);
}

becomes

bar(int):
    movw    r3, #4242
    and r3, r0, r3
    cmp r3, #0
    movne   r0, #1
    bxne    lr
    bx  lr

Why can't GCC optimize the logical bitwise AND pair in "x && (x & 4242)" to "x & 4242"?

Related

Recent Posts