Newbetuts
New posts in compiler-optimization
Replacing a 32-bit loop counter with 64-bit introduces crazy performance deviations with _mm_popcnt_u64 on Intel CPUs
c++
performance
assembly
x86
compiler-optimization
Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?
gcc
assembly
floating-point
compiler-optimization
fast-math
Why are elementwise additions much faster in separate loops than in a combined loop?
c++
performance
x86
vectorization
compiler-optimization