Newbetuts
New posts in x86
What is the best way to set a register to zero in x86 assembly: xor, mov or and?
performance
assembly
optimization
x86
micro-optimization
Replacing a 32-bit loop counter with 64-bit introduces crazy performance deviations with _mm_popcnt_u64 on Intel CPUs
c++
performance
assembly
x86
compiler-optimization
SSE instructions: which CPUs can do atomic 16B memory operations?
concurrency
x86
thread-safety
atomic
sse
Why are elementwise additions much faster in separate loops than in a combined loop?
c++
performance
x86
vectorization
compiler-optimization