How does the ARM Thumb instruction set's BLX instruction support a 4 MB range?
In the original Thumb instruction set, the BL instruction comprised two 16 bit instructions encoded as follows:
1111 HOOO OOOO OOOO   BL <label>
     |            |
     |            \.. long branch and link offset high/low
     \............... low/high offset
                      0 -- offset high
                      1 -- offset low
The first of the two instructions must have the H bit set to 0. The 11 bit offset is sign-extended, shifted to the left by 12, added to PC (which reads as the address of this instruction plus 4), and placed into the LR register.
LR = PC + (sign_extend(offset) << 12)
The second of the two instructions must have the H bit set to 1. The 11 bit offset is shifted to the left by 1, added to the contents of the LR register, and used as the branch target. The LR register is then set to the return address.
temp = next instruction address
PC = LR + (offset << 1)
LR = temp | 1
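
The two steps above can be traced in a short Python sketch. The helper names `sign_extend` and `bl_target_legacy` are illustrative only and do not come from any ARM document. The sketch assumes that `addr` is the address of the first halfword, that PC reads as that address plus 4, and that the high offset is sign-extended before shifting.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def bl_target_legacy(addr, hw1, hw2):
    """Return (target, lr) for a two-halfword BL at addr (hypothetical helper)."""
    assert hw1 >> 11 == 0b11110, "first half must have H = 0"
    assert hw2 >> 11 == 0b11111, "second half must have H = 1"
    pc = addr + 4                                    # PC as seen by the first half
    lr = pc + (sign_extend(hw1 & 0x7FF, 11) << 12)   # first half: high part into LR
    target = (lr + ((hw2 & 0x7FF) << 1)) & 0xFFFFFFFF  # second half: add low part
    return_addr = (addr + 4) | 1                     # next instruction, bit 0 set
    return target, return_addr
```

For instance, the pair `0xF7FF 0xFFFE` placed at `0x8000` encodes an offset of -4 relative to PC, so the call lands back on the instruction itself, while `0xF000 0xF800` calls the instruction that follows.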
With ARMv5T, a Thumb encoding of the BLX instruction was added, allowing Thumb code to call into ARM code. This was done by defining a new thumb bit in the second half* of the BL instruction.
111T 1OOO OOOO OOOO   BL/BLX <label> (second half)
   |              |
   |              \.. long branch link exchange offset low
   \................. thumb bit
                      0 -- BLX is encoded
                      1 -- BL is encoded
The operation of BLX is similar to the second half of the BL instruction, but the offset must be even. The function is called in ARM state instead of Thumb state.
temp = next instruction address
PC = (LR + (offset << 1)) & 0xfffffffc
LR = temp | 1
CPSR T bit = 0
Note that with a total of 22 immediate bits giving an offset in halfwords, the observed branch offset of ±4 MiB is achieved.
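
The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: `branch_range` is a made-up helper, and it assumes the immediate is a two's-complement count of 2-byte units.

```python
def branch_range(imm_bits, unit=2):
    """Lowest and highest byte offset of a signed imm_bits-wide count of units."""
    lowest = -(1 << (imm_bits - 1)) * unit
    highest = ((1 << (imm_bits - 1)) - 1) * unit
    return lowest, highest


MIB = 1024 * 1024
low, high = branch_range(22)
print(low / MIB, high / MIB)
```

With 22 bits the lowest offset is exactly -4 MiB and the highest is 4 MiB minus one halfword.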
Putting the two halves together, we can also see BL and BLX as 32 bit instructions with an encoding like this:
1111 0OOO OOOO OOOO 111T 1OOO OOOO OOOO   BL/BLX <label>
                 |     |              |
                 |     |              \.. 22 bit offset (low half)
                 |     \................. thumb bit
                 \....................... 22 bit offset (high half)
In Thumb2, this scheme was extended. BL and BLX became proper 32 bit instructions and their halves must be given consecutively.† Some bits of the second instruction word were defined to extend the branch offset to ±16 MiB.
1111 0SOO OOOO OOOO 11AT BOOO OOOO OOOO   BL/BLX <label>
      |          |    || |            |
      |          |    || |            \.. 21 bit offset (low half)
      |          |    || \............... additional bit J2
      |          |    |\................. thumb bit
      |          |    \.................. additional bit J1
      |          \....................... 21 bit offset (high half)
      \.................................. sign bit
If the thumb bit is set, the BL instruction is encoded. If it is clear, the BLX instruction is encoded. In the latter case, the 21 bit offset must be even. The branch offset imm32 is then computed from the sign bit S, the additional bits J1 and J2, and the 21 bit offset imm21 as follows:
I1 = !(J1 ^ S)
I2 = !(J2 ^ S)
imm32 = (S ? 0xff << 24 : 0) | (I1 << 23) | (I2 << 22) | (imm21 << 1)
temp = next instruction address
if thumb bit set
    PC = PC + imm32
else
    PC = (PC & 0xfffffffc) + imm32
    CPSR T bit = 0
LR = temp | 1
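
As a rough illustration of this computation, here is a Python sketch that decodes the two halfwords of a 32 bit BL/BLX. The function name `thumb2_bl_target` and its return shape are invented for this example. It assumes PC reads as the instruction address plus 4 and, following the architecture manual, adds the offset to PC directly (word-aligned first in the BLX case) rather than going through LR.

```python
def thumb2_bl_target(addr, hw1, hw2):
    """Return (target, is_blx) for a 32 bit BL/BLX at addr (hypothetical helper)."""
    s = (hw1 >> 10) & 1          # sign bit
    imm10 = hw1 & 0x3FF          # high part of the 21 bit offset
    j1 = (hw2 >> 13) & 1         # additional bit J1
    thumb = (hw2 >> 12) & 1      # 1 = BL, 0 = BLX
    j2 = (hw2 >> 11) & 1         # additional bit J2
    imm11 = hw2 & 0x7FF          # low part of the 21 bit offset
    i1 = 1 - (j1 ^ s)            # I1 = !(J1 ^ S)
    i2 = 1 - (j2 ^ s)            # I2 = !(J2 ^ S)
    imm25 = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    imm32 = imm25 - (s << 25)    # sign-extend from 25 bits
    pc = addr + 4
    if not thumb:
        pc &= ~3                 # BLX targets ARM code, so align PC to a word
    return (pc + imm32) & 0xFFFFFFFF, not thumb
```

Clearing J1 while S is 0 sets I1, which adds 8 MiB to the offset: `thumb2_bl_target(0x8000, 0xF000, 0xD800)` yields a target of `0x808004`, which is beyond the ±4 MiB reach of the older encoding.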
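While the scheme to encode the additional offset bits seems convoluted at first, it is just the simplest way to encode two additional bits into the branch offset while being compatible with the existing encoding of the BL and BLX instructions. In the old encoding, the two bit positions now used for J1 and J2 are always 1. With J1 = J2 = 1, both I1 and I2 equal S, so the two extra offset bits are merely copies of the sign bit and every old instruction decodes to the same offset as before.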
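
The compatibility claim can be checked mechanically. The following Python sketch decodes the same halfword pair once as two pre-Thumb2 halves and once with the Thumb2 formula, then compares the byte offsets. The names `legacy_offset`, `thumb2_offset` and `encodings_agree` are made up for this example.

```python
def legacy_offset(hw1, hw2):
    """Byte offset of a pre-Thumb2 BL pair: signed 22 bit halfword count."""
    imm22 = ((hw1 & 0x7FF) << 11) | (hw2 & 0x7FF)
    if imm22 & (1 << 21):
        imm22 -= 1 << 22         # sign-extend from 22 bits
    return imm22 << 1


def thumb2_offset(hw1, hw2):
    """Byte offset of the same pair decoded with the Thumb2 rules."""
    s = (hw1 >> 10) & 1
    i1 = 1 - (((hw2 >> 13) & 1) ^ s)
    i2 = 1 - (((hw2 >> 11) & 1) ^ s)
    imm25 = ((s << 24) | (i1 << 23) | (i2 << 22)
             | ((hw1 & 0x3FF) << 12) | ((hw2 & 0x7FF) << 1))
    return imm25 - (s << 25)     # sign-extend from 25 bits


def encodings_agree(pairs):
    """True if both decoders give the same offset for every (hw1, hw2) pair."""
    return all(legacy_offset(a, b) == thumb2_offset(a, b) for a, b in pairs)


# Sample of old-style BL pairs: first half 11110..., second half 11111...
old_style = [(a, b) for a in range(0xF000, 0xF800, 37)
             for b in range(0xF800, 0x10000, 41)]
print(encodings_agree(old_style))
```

Every pair in the sample has J1 = J2 = 1, which is why the two decoders agree on all of them.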
See the ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition, the ARM7TDMI Data Sheet, and the ARM Architecture Reference Manual for ARMv5 for further reading.
* The related encoding 1110 0OOO OOOO OOOO encodes the 16 bit unconditional branch instruction B <label>.
† Before Thumb2, the two parts of a BL or BLX instruction were independent instructions and could be given interspersed with other instructions or even individually, though it was strongly recommended to issue them in consecutive order. An interrupt could also occur between the two halves of a BL or BLX instruction, making the temporary contents of the LR register observable. On Thumb2 targets including ARMv6-M, this is no longer possible and BL and BLX behave as a 32 bit instruction.