Newbetuts
New posts in assembly
Displaying numbers with DOS
assembly
dos
x86-16
integer-division
signed-integer
Test whether a register is zero with CMP reg,0 vs OR reg,reg?
assembly
optimization
x86
micro-optimization
Why does mulss take only 3 cycles on Haswell, different from Agner's instruction tables? (Unrolling FP loops with multiple accumulators)
c
assembly
x86
sse
micro-optimization
Can x86's MOV really be "free"? Why can't I reproduce this at all?
c
assembly
x86
cpu-architecture
micro-optimization
How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent
assembly
x86
intel
cpu-architecture
micro-optimization
How do I achieve the theoretical maximum of 4 FLOPs per cycle?
c++
assembly
x86-64
cpu-architecture
flops
Boot loader doesn't jump to kernel code
assembly
virtualbox
nasm
x86-16
bootloader
Why should EDX be 0 before using the DIV instruction?
assembly
x86
integer-division
Why doesn't GCC use partial registers?
assembly
gcc
x86
x86-64
cpu-architecture
Assembling 32-bit binaries on a 64-bit system (GNU toolchain)
linux
assembly
build
x86
att
Why are loops always compiled into "do...while" style (tail jump)?
performance
loops
assembly
optimization
micro-optimization
Can num++ be atomic for 'int num'?
c++
c
multithreading
assembly
atomic
Why does GCC use multiplication by a strange number in implementing integer division?
c
gcc
assembly
x86-64
integer-division
Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly?
c++
performance
assembly
optimization
x86
What's the purpose of the LEA instruction?
assembly
x86
x86-64
x86-16
Referencing the contents of a memory location. (x86 addressing modes)
assembly
x86
masm
addressing-mode
Fastest way to do horizontal SSE vector sum (or other reduction)
assembly
optimization
floating-point
sse
simd
How do I print an integer in Assembly Level Programming without printf from the c library?
assembly
x86
integer
output
nasm
Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
c
assembly
x86-64
compiler-optimization
llvm-codegen
Micro fusion and addressing modes
assembly
x86
intel
cpu-architecture
iaca