Operand size prefix in 16-bit mode

I'm trying to understand GAS's behavior of .code16.

From the manual, it seems in 16-bit section, for 32-bit operands or instructions, a 66H operand override prefix will be produced for the instruction encoding. Does that mean

.code16
movw %eax, %ebx

is legal in such mode? Then the code cannot run on 16-bit processor?


These are legal instructions for 80386+.

Starting with the 80386 we can use operandsize- and addresssize- override prefixes. Those prefixes can be used in combination with the 16 bit address mode and with the 32-bit address mode.

Additional it can be used with the real address mode and with the protected mode and the virtual 86 mode. Those prefixes reverse the default operand size and/or the address size for one instruction in the code segment. The default operand size and the address size is specified by the D flag in the code-segment descriptor (or if there is no GDT/LDT, then we become the 16 bit address mode after the POST-process of the bios is done.)

With the 16 bit address mode we have to add those prefixes, if we want to use 32-bit operands and/or 32-bit addresses. Without those prefixes we can only use 16-bit addresses/operands in the 16-bit address mode.

With the 32-bit address mode we have to leave out those prefixes from our code, if we want to use 32-bit operands and/or 32-bit addresses. And if we add those prefixes to our code, then we can use 16-bit addresses/operands in the 32-bit address mode.

Intel:

Instruction prefixes can be used to override the default operand size and address size of a code segment. These prefixes can be used in real-address mode as well as in protected mode and virtual-8086 mode. An operand-size or address-size prefix only changes the size for the duration of the instruction.

The following two instruction prefixes allow mixing of 32-bit and 16-bit operations within one segment:

  • The operand-size prefix (66H)
  • The address-size prefix (67H)

These prefixes reverse the default size selected by the D flag in the code-segment descriptor. For example, the processor can interpret the (MOV mem, reg) instruction in any of four ways:

  • In a 32-bit code segment:

    • Moves 32 bits from a 32-bit register to memory using a 32-bit effective address.
    • If preceded by an operand-size prefix, moves 16 bits from a 16-bit register to memory using a 32-bit effective address.
    • If preceded by an address-size prefix, moves 32 bits from a 32-bit register to memory using a 16-bit effective address.
    • If preceded by both an address-size prefix and an operand-size prefix, moves 16 bits from a 16-bit register to memory using a 16-bit effective address.
  • In a 16-bit code segment:

    • Moves 16 bits from a 16-bit register to memory using a 16-bit effective address.
    • If preceded by an operand-size prefix, moves 32 bits from a 32-bit register to memory using a 16-bit effective address.
    • If preceded by an address-size prefix, moves 16 bits from a 16-bit register to memory using a 32-bit effective address.
    • If preceded by both an address-size prefix and an operand-size prefix, moves 32 bits from a 32-bit register to memory using a 32-bit effective address.

The previous examples show that any instruction can generate any combination of operand size and address size regardless of whether the instruction is in a 16- or 32-bit segment. The choice of the 16- or 32-bit default for a code segment is normally based on the following criteria:

  • Performance — Always use 32-bit code segments when possible. They run much faster than 16-bit code segments on P6 family processors, and somewhat faster on earlier IA-32 processors.
  • The operating system the code segment will be running on — If the operating system is a 16-bit operating system, it may not support 32-bit program modules.
  • Mode of operation — If the code segment is being designed to run in real-address mode, virtual-8086 mode, or SMM, it must be a 16-bit code segment.
  • Backward compatibility to earlier IA-32 processors — If a code segment must be able to run on an Intel 8086 or Intel 286 processor, it must be a 16-bit code segment.

The D flag in a code-segment descriptor determines the default operand-size and address-size for the instructions of a code segment. (In real-address mode and virtual-8086 mode, which do not use segment descriptors, the default is 16 bits.) A code segment with its D flag set is a 32-bit segment; a code segment with its D flag clear is a 16-bit segment.

Executable code segment. The flag is called the D flag and it indicates the default length for effective addresses and operands referenced by instructions in the segment. If the flag is set, 32-bit addresses and 32-bit or 8-bit operands are assumed; if it is clear, 16-bit addresses and 16-bit or 8-bit operands are assumed. The instruction prefix 66H can be used to select an operand size other than the default, and the prefix 67H can be used select an address size other than the default.

The 32-bit operand prefix can be used in real-address mode programs to execute the 32-bit forms of instructions. This prefix also allows real-address mode programs to use the processor’s 32-bit general-purpose registers. The 32-bit address prefix can be used in real-address mode programs, allowing 32-bit offsets.

The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception.

Assembler Usage:

If a code segment that is going to run in real-address mode is defined, it must be set to a USE 16 attribute. If a 32-bit operand is used in an instruction in this code segment (for example, MOV EAX, EBX), the assembler automatically generates an operand prefix for the instruction that forces the processor to execute a 32-bit operation, even though its default code-segment attribute is 16-bit.

The 32-bit operand prefix allows a real-address mode program to use the 32-bit general-purpose registers (EAX, EBX, ECX, EDX, ESP, EBP, ESI, and EDI).

When moving data in 32-bit mode between a segment register and a 32-bit general-purpose register, the Pentium Pro processor does not require the use of a 16-bit operand size prefix; however, some assemblers do require this prefix. The processor assumes that the 16 least-significant bits of the general-purpose register are the destination or source operand. When moving a value from a segment selector to a 32-bit register, the processor fills the two high-order bytes of the register with zeros.

AMD:

3.3.2. 32-Bit vs. 16-Bit Address and Operand Sizes

The processor can be configured for 32-bit or 16-bit address and operand sizes. With 32-bit address and operand sizes, the maximum linear address or segment offset is FFFFFFFFH (2^32-1), and operand sizes are typically 8 bits or 32 bits. With 16-bit address and operand sizes, the maximum linear address or segment offset is FFFFH (2^16-1), and operand sizes are typically 8 bits or 16 bits.

When using 32-bit addressing, a logical address (or far pointer) consists of a 16-bit segment selector and a 32-bit offset; when using 16-bit addressing, it consists of a 16-bit segment selector and a 16-bit offset. Instruction prefixes allow temporary overrides of the default address and/or operand sizes from within a program.

When operating in protected mode, the segment descriptor for the currently executing code segment defines the default address and operand size. A segment descriptor is a system data structure not normally visible to application code. Assembler directives allow the default addressing and operand size to be chosen for a program. The assembler and other tools then set up the segment descriptor for the code segment appropriately.

When operating in real-address mode, the default addressing and operand size is 16 bits. An address-size override can be used in real-address mode to enable 32-bit addressing; however, the maximum allowable 32-bit linear address is still 000FFFFFH (2^20-1).

3.6. OPERAND-SIZE AND ADDRESS-SIZE ATTRIBUTES

When the processor is executing in protected mode, every code segment has a default operand-size attribute and address-size attribute. These attributes are selected with the D (default size) flag in the segment descriptor for the code segment (see Chapter 3, Protected-Mode Memory Management, in the Intel Architecture Software Developer’s Manual, Volume 3). When the D flag is set, the 32-bit operand-size and address-size attributes are selected; when the flag is clear, the 16-bit size attributes are selected. When the processor is executing in real-address mode, virtual-8086 mode, or SMM (System-Management-Mode), the default operand-size and address-size attributes are always 16 bits.

The operand-size attribute selects the sizes of operands that instructions operate on. When the 16-bit operand-size attribute is in force, operands can generally be either 8 bits or 16 bits, and when the 32-bit operand-size attribute is in force, operands can generally be 8 bits or 32 bits.

The address-size attribute selects the sizes of addresses used to address memory: 16 bits or 32 bits. When the 16-bit address-size attribute is in force, segment offsets and displacements are 16 bits. This restriction limits the size of a segment that can be addressed to 64 KBytes. When the 32-bit address-size attribute is in force, segment offsets and displacements are 32 bits, allowing segments of up to 4 GBytes to be addressed.

The default operand-size attribute and/or address-size attribute can be overridden for a particular instruction by adding an operand-size and/or address-size prefix to an instruction (see “Instruction Prefixes” in Chapter 2 of the Intel Architecture Software Developer’s Manual, Volume 3). The effect of this prefix applies only to the instruction it is attached to.

Table 3-1 shows effective operand size and address size (when executing in protected mode) depending on the settings of the D flag and the operand-size and address-size prefixes.