New posts in x86

How to print a character in Linux x86 NASM?

linux assembly x86 nasm system-calls

Weird performance effects from nearby dependent stores in a pointer-chasing loop on IvyBridge. Adding an extra load speeds it up?

assembly x86 micro-optimization microbenchmark micro-architecture

What is an assembly-level representation of pushl/popl %esp?

assembly x86 stack-memory instruction-set stack-pointer

How to multiply a register by 37 using only 2 consecutive leal instructions in x86?

assembly x86 x86-64 multiplication strength-reduction

Trial-division code runs 2x faster as 32-bit on Windows than 64-bit on Linux

c++ performance x86 benchmarking 32bit-64bit

Why is std::fill(0) slower than std::fill(1)?

c++ performance x86 compiler-optimization memset

Why is Windows 32-bit called Windows x86 and not Windows x32? [closed]

windows x86 operating-system 32bit-64bit 32-bit

Why not store function parameters in XMM vector registers?

assembly x86 parameter-passing x86-64 calling-convention

How do mutex lock and unlock functions prevent CPU reordering?

c assembly x86 mutex memory-barriers

What are the segment and offset in real mode memory addressing?

assembly x86 x86-16 real-mode memory-segmentation

x86_64 ASM - maximum bytes for an instruction?

c assembly x86 64-bit x86-64

Why does Windows 7 x64 work faster than an x86 edition on my PC?

windows-7 64-bit x86

How to generate assembly code with clang in Intel syntax?

c++ assembly x86 clang intel

What was the original reason for the design of AT&T assembly syntax?

assembly x86 intel att

What does the dollar sign ($) mean in x86 assembly when calculating string lengths like "$ - label"? [duplicate]

assembly x86 intel-syntax

Is using double faster than float?

c++ performance x86 intel osx-snow-leopard

Which is a better write barrier on x86: lock+addl or xchgl?

assembly x86 memory-barriers

Is there a way to get gcc to output raw binary?

linux gcc command-line linker x86

Argument order to std::min changes compiler output for floating-point

android c++ assembly x86 floating-point

Loop unrolling to achieve maximum throughput with Ivy Bridge and Haswell

c++ x86 intel sse avx