Why does IA32 not allow memory-to-memory mov? [duplicate]

The answer involves a fuller understanding of RAM. Simply stated, RAM can only be in one of two states at a time: read mode or write mode. If you wish to copy one byte in RAM to another location, you must have a temporary storage area outside of RAM while you switch from read to write.

It is certainly possible for the architecture to have such a RAM-to-RAM instruction, but it would be a high-level instruction that microcode would translate into copying the data from RAM to a register and then back to RAM. Alternatively, the RAM controller could be extended with a temporary register just for this copying of data, but that wouldn't provide much benefit for the added complexity of CPU/hardware interaction.

EDIT: It is worth noting that recent advancements such as Hybrid Memory Cube and High Bandwidth Memory are architectures in which the RAM topology has become more like PCI-e, and direct RAM-to-RAM transfers are now possible, but that is due to the support logic for these technologies, not the RAM itself. In the CPU architecture, this would take the form of huge blocks of RAM at a time, like DMA, and not a single instruction. In addition, the CPU cache behaves like traditional RAM, so the architecture would have to abstract it as per my original explanation.

EDIT2: Per @PeterCordes's comment, my original understanding was not entirely correct; x86 does in fact have a few memory-to-memory instructions. The real reason they are not available for most instructions (such as movl and movw) is to keep instruction encoding complexity low, although they could have been implemented. The basic idea in my original answer, that there is a temporary storage location outside of RAM in the form of a latch or register, is correct, but it is not the reason these instructions don't exist. Even older chips from the 1970s such as the 6502 and the 8086 have instructions that both read and write memory, and you could easily perform operations such as INC directly on a RAM location. This was accomplished by latching the memory fetch directly to the ALU and back out to memory again without going through a register used by the instruction set.


IA-32 is x86, and x86 evolved from the Intel 8086 (iAPX 86). That was a small and cheap chip whose instruction set was modeled on 8-bit instruction sets, and it had no "mov" with two explicit memory operands.

Wikipedia gives this explanation of the 8086 instruction encoding:

Due to a compact encoding inspired by 8-bit processors, most instructions are one-address or two-address operations, which means that the result is stored in one of the operands. At most one of the operands can be in memory, but this memory operand can also be the destination, while the other operand, the source, can be either register or immediate. A single memory location can also often be used as both source and destination which, among other factors, further contributed to a code density comparable to (and often better than) most eight-bit machines at the time.
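To make the "at most one of the operands can be in memory" rule concrete, here is a minimal C sketch of a memory-to-memory copy of one word. The helper name copy_word is made up for illustration, and the instructions named in the comments are only what a compiler would typically emit for x86, not a guaranteed output:

```c
#include <stdint.h>

/* Hypothetical helper: copy one 32-bit word from *src to *dst.
 * Because a plain x86 mov cannot name two memory operands,
 * the value has to pass through a register on the way. */
static void copy_word(uint32_t *dst, const uint32_t *src)
{
    uint32_t tmp = *src;  /* load:  typically a mov from memory into a register */
    *dst = tmp;           /* store: typically a mov from that register to memory */
}
```

The temporary here plays the role of the register: one instruction has the memory operand as its source, the next has it as its destination.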

There were some CISCs with memory-memory instructions (single instruction to operate on two memory operands). The lecture https://www.cis.upenn.edu/~milom/cis501-Fall05/lectures/02_isa.pdf says that VAX can encode memory-memory instructions:

DEC VAX (Virtual Address eXtension to PDP-11): 1977

  • Variable length instructions: 1-321 bytes!!!
  • 14 GPRs + PC + stack-pointer + condition codes
  • Data sizes: 8, 16, 32, 64, 128 bit, decimal, string
  • Memory-memory instructions for all data sizes
  • Special insns: crc, insque, polyf, and a cast of hundreds

This is the OpenBSD memcpy source for VAX (instruction set manual: http://h20565.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04623178):

https://es.osdn.jp/projects/openbsd-octeon/scm/git/openbsd-octeon/blobs/master/src/sys/lib/libkern/arch/vax/memcpy.S

         movq    8(ap),r1        /* r1 = src, r2 = length */
         movl    4(ap),r3        /* r3 = dst */
... 
 1:      /* move forward */
         cmpl    r2,r0
         bgtru   3f              /* stupid movc3 limitation */
         movc3   r2,(r1),(r3)    /* move it all */

The "movc3" instruction here has two memory operands, whose addresses are held in registers.
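The "movc3 limitation" mentioned in the comment is, as far as I know, that movc3 takes its length as a 16-bit operand, so a single instruction can move at most 65535 bytes. The following C sketch shows how a larger copy can be split into such pieces. It is not a translation of the elided part of the OpenBSD routine; movc3_like and copy_chunked are hypothetical names, and memcpy merely stands in for the real instruction:

```c
#include <stddef.h>
#include <string.h>

/* Assumed limit: the largest length a 16-bit operand can express. */
#define MOVC3_MAX 65535u

/* Stand-in for a single movc3 (length, source address, destination address). */
static void movc3_like(size_t len, const unsigned char *src, unsigned char *dst)
{
    memcpy(dst, src, len);
}

/* Forward copy of non-overlapping buffers, at most MOVC3_MAX bytes per step. */
static void *copy_chunked(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len > MOVC3_MAX) {
        movc3_like(MOVC3_MAX, s, d);
        s += MOVC3_MAX;
        d += MOVC3_MAX;
        len -= MOVC3_MAX;
    }
    movc3_like(len, s, d);  /* remaining bytes now fit in one "instruction" */
    return dst;
}
```

When the length already fits, the loop body never runs and the whole copy is a single call, which corresponds to the "move it all" path in the excerpt.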

x86 has several "string" instructions that perform memory-to-memory operations (the *s family, especially movs - http://x86.renejeschke.de/html/file_module_x86_id_203.html), but these instructions use the predefined registers SI and DI as addresses (implicit operands), so two explicit memory operands still can't be encoded in a single x86 instruction.
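As a rough illustration of the implicit operands, here is a sketch using GCC/Clang extended inline assembly (a compiler extension, not standard C). The function name copy_movsb is made up; the "D", "S" and "c" constraints pin the pointers and the count to (E/R)DI, (E/R)SI and (E/R)CX, which is where movs with a rep prefix expects them. It assumes the direction flag is clear, as the usual calling conventions require, and falls back to memcpy on non-x86 targets:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes with "rep movsb".
 * The instruction itself names no operands; it reads from [SI], writes
 * to [DI], and repeats CX times, advancing both pointers as it goes. */
static void copy_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile ("rep movsb"
                      : "+D"(dst), "+S"(src), "+c"(n)  /* DI, SI, CX are updated */
                      :
                      : "memory");
#else
    memcpy(dst, src, n);  /* fallback so the sketch still builds elsewhere */
#endif
}
```

Note that the assembly template contains only the mnemonic: both memory addresses come from fixed registers rather than from the instruction encoding.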


As far as I know, as a general rule in this architecture, an instruction is allowed at most one explicit memory operand. This is because dealing with two memory operands per instruction would complicate the processor's execution pipeline.