Why use LDR over MOV (or vice versa) in ARM assembly?

I'm looking through this tutorial: http://www.cl.cam.ac.uk/freshers/raspberrypi/tutorials/os/ok01.html

The first line of assembly is:

ldr r0,=0x20200000

the second is:

mov r1,#1

I thought ldr was for loading values from memory into registers. But it seems the = means the 0x20200000 is a value not a memory address. Both lines seem to be loading the absolute values.


Solution 1:

It is a trick/shortcut. Say, for example, you write

ldr r0,=main

What happens is that the assembler allocates a data word near the instruction but outside the instruction path, and turns the ldr into a pc-relative load of that word, roughly as if you had written:

ldr r0,main_addr
...
b somewhere
main_addr: .word main

Now expand that trick to constants/immediates, especially those that cannot fit into a move immediate instruction:

top:
add r1,r2,r3
ldr r0,=0x12345678
eor r1,r2,r3
eor r1,r2,r3
b top

Assemble, then disassemble:

00000000 <top>:
   0:   e0821003    add r1, r2, r3
   4:   e59f0008    ldr r0, [pc, #8]    ; 14 <top+0x14>
   8:   e0221003    eor r1, r2, r3
   c:   e0221003    eor r1, r2, r3
  10:   eafffffa    b   0 <top>
  14:   12345678    eorsne  r5, r4, #125829120  ; 0x7800000

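You can see that the assembler has added the data word for you (at 0x14) and changed the ldr into a pc-relative load. The disassembler tries to decode that data word as an instruction (the eorsne), which you can ignore.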
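To check where that pc-relative load actually reads from, here is a minimal Python sketch. The function name `literal_address` is made up for illustration, and it assumes ARM state (not Thumb), where the pc reads as the instruction's address plus 8.

```python
def literal_address(instr_addr, instr_word):
    """Address read by an ARM-state `ldr rX, [pc, #imm]` instruction."""
    offset = instr_word & 0xFFF       # low 12 bits: unsigned byte offset
    add = (instr_word >> 23) & 1      # U bit: 1 = add the offset, 0 = subtract it
    pc = instr_addr + 8               # in ARM state, pc reads as instruction address + 8
    return pc + offset if add else pc - offset

# The ldr at address 0x4 in the listing above, encoded as e59f0008
print(hex(literal_address(0x4, 0xE59F0008)))  # 0x14
```

The result matches the address of the 12345678 data word in the disassembly.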

Now, if you use an immediate that does fit in a mov instruction, then the assembler may turn the ldr into a mov for you. This perhaps depends on the assembler, but the gnu as I am using certainly does it:

top:
add r1,r2,r3
ldr r0,=0x12345678
ldr r5,=1
mov r6,#1
eor r1,r2,r3
eor r1,r2,r3
b top


00000000 <top>:
   0:   e0821003    add r1, r2, r3
   4:   e59f0010    ldr r0, [pc, #16]   ; 1c <top+0x1c>
   8:   e3a05001    mov r5, #1
   c:   e3a06001    mov r6, #1
  10:   e0221003    eor r1, r2, r3
  14:   e0221003    eor r1, r2, r3
  18:   eafffff8    b   0 <top>
  1c:   12345678    eorsne  r5, r4, #125829120  ; 0x7800000

So it is basically a typing shortcut. Understand that you are giving the assembler the power to find a place to stick the constant. It usually does a good job and sometimes complains; I am not sure I have ever seen it fail to do it safely. Sometimes you need a .ltorg or .pool in the code to encourage the assembler to find a place.

Solution 2:

A shorter response, from someone who is closer to your level; hope it helps. In ARM, instructions are 32 bits wide. Some bits are used to identify the operation, some for the operands, and, in the case of the MOV instruction, some are available for an immediate value (#1, for example).

As you can see here (page 33), there are only 12 bits available for the immediate value. Instead of treating those 12 bits as a plain number (which would range from 0 to 2^12-1 = 4095), the instruction computes the immediate by rotating right (ROR) the low 8 bits by two times the amount specified in the top 4 bits. That is, immediate = low 8 bits ROR 2*(top 4 bits).

This way, we can reach a much wider range of numbers than just 0 to 4095 (see page 34 for a brief summary of possible immediates). Keep in mind, though, that with 12 bits there are still only 4096 possible encodings, and some of them produce the same value (0 rotated by any amount is still 0), so fewer than 4096 distinct values can be specified.
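To make the 12-bit scheme concrete, here is a small Python sketch that decodes every possible 12-bit field and counts the distinct results. The names `ror32` and `decode_imm12` are made up for illustration, and the sketch assumes the classic ARM-state encoding described above (8-bit value, 4-bit rotation field).

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def decode_imm12(imm12):
    """Value encoded by a 12-bit field: low 8 bits rotated right by 2 * top 4 bits."""
    imm8 = imm12 & 0xFF
    rot = (imm12 >> 8) & 0xF
    return ror32(imm8, 2 * rot)

distinct = {decode_imm12(encoding) for encoding in range(4096)}
print(len(distinct))            # 3073 distinct values from 4096 encodings
print(0xFF000000 in distinct)   # True: 0xFF rotated right by 8
print(257 in distinct)          # False: 0x101 needs 9 bits
```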

If our number cannot be encoded that way (257 is 0x101, which needs 9 bits, so it cannot be expressed as an 8-bit value rotated right by two times any 4-bit amount), then we have to use LDR r0, =257
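If you want to check a particular constant by hand, a brute-force search over the 16 rotations is enough. This is only a sketch under the same assumptions (ARM state, 8-bit value with an even rotation), and `rol32` and `encode_imm12` are hypothetical helper names, not part of any toolchain.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF

def encode_imm12(value):
    """Return (imm8, rot) such that imm8 ROR (2 * rot) == value, or None if impossible."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        imm8 = rol32(value, 2 * rot)   # undo a right rotation by 2 * rot
        if imm8 <= 0xFF:
            return imm8, rot
    return None

print(encode_imm12(1))            # (1, 0)      -> mov r1, #1 works
print(encode_imm12(257))          # None        -> needs ldr r0, =257
print(encode_imm12(0x20200000))   # None        -> the tutorial's constant needs ldr too
print(encode_imm12(0xFF000000))   # (255, 4)    -> fits despite being a large number
```

The third line shows why the tutorial's first instruction is an ldr: the set bits of 0x20200000 (bits 21 and 29) span 9 bits.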

In this case, the assembler saves the number 257 in memory, close to the program code, so it can be addressed relative to the PC, and the instruction loads it from memory, just as dwelch explained in detail.

Note: If you follow that tutorial, then when you try to 'make' with mov r0, #257 you will get an error, and you have to manually try ldr r0, =257.

Solution 3:

As good as the other answers are, I think I might be able to simplify the answer.

ldr = LoaD Register

mov = MOVe

Both effectively do the same thing but in different ways.

The difference is a lot like the difference between

#define CONST 5

and

int CONST = 5;

in C language.

mov is really fast because it has the accompanying value directly stored as a part of the instruction (in the 12 bit format described in the answer above). It has some limitations due to the way it stores the value. Why? Because

  • 12 bits are not sufficient for storing huge numbers like 32-bit memory addresses.
  • First 8 bits ROR 2 * (Last 4 bits) cannot represent just any number, even in the 12-bit range (257, for example).

ldr, on the other hand, is versatile (mainly because the assembler picks the best encoding for you). It works like this (as shown in the disassembled routine):

  • If the value can be represented in the 12-bit First 8 bits ROR 2 * (Last 4 bits) format, then the assembler changes the ldr into a mov instruction carrying the value.

  • Otherwise, the value is kept as a data word placed in memory near the code, and it is loaded into the required register from memory, using an offset from the program counter.
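The two cases above can be sketched in Python as a rough model of what the assembler decides. The function names are hypothetical and this is not the assembler's actual code. The mvn branch is an extra case not mentioned in this answer: the GNU assembler documents that it will use MVN when the inverted constant fits.

```python
MASK32 = 0xFFFFFFFF

def fits_rotated_imm8(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= MASK32
    for shift in range(0, 32, 2):
        rotated = ((value << shift) | (value >> (32 - shift))) & MASK32
        if rotated <= 0xFF:
            return True
    return False

def expand_ldr_pseudo(reg, value):
    """Rough model of how `ldr reg, =value` might be expanded (illustrative only)."""
    value &= MASK32
    if fits_rotated_imm8(value):
        return f"mov {reg}, #{value:#x}"
    inverted = ~value & MASK32
    if fits_rotated_imm8(inverted):
        return f"mvn {reg}, #{inverted:#x}"   # assumption: GNU as behaviour
    # Otherwise: a pc-relative load from a literal pool entry
    return f"ldr {reg}, [pc, #<offset>]  ; plus .word {value:#x} in the literal pool"

print(expand_ldr_pseudo("r5", 1))
print(expand_ldr_pseudo("r0", 0x12345678))
```

The first call prints a mov and the second prints the pc-relative ldr, matching the disassembly in the first answer.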

I hope it helped.