Difference between word addressable and byte addressable
Can someone explain the difference between word addressable and byte addressable? How is it related to memory size etc.?
- A byte is a memory unit for storage
- A memory chip is full of such bytes.
Memory units are addressable. That is the only way we can use memory.
In reality, memory is only byte addressable. It means:
- A binary address always points to a single byte only.
- A word is just a group of bytes (2, 4, or 8) depending upon the data bus size of the CPU.
To understand the memory operation fully, you must be familiar with the various registers of the CPU and the memory ports of the RAM. I assume you know their meaning:
- MAR (memory address register)
- MDR (memory data register)
- PC (program counter register)
- MBR (memory buffer register)
RAM has two kinds of memory ports:
- 32-bit for data/addresses
- 8-bit for OPCODE.
Suppose the CPU wants to read a word (say 4 bytes) from the address xyz onwards. The CPU would put the address on the MAR and send a memory read signal to the memory controller chip. On receiving the address and read signal, the memory controller would connect the data bus to the 32-bit port, and 4 bytes starting from the address xyz would flow out of the port to the MDR.
If the CPU wants to fetch the next instruction, it would put the address into the PC register and send a fetch signal to the memory controller. On receiving the address and fetch signal, the memory controller would connect the data bus to the 8-bit port, and a single-byte opcode located at the address received would flow out of the RAM into the CPU's MDR.
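The two kinds of read described above can be sketched in a few lines of Python. This is only a toy model, not how a real memory controller works: `ram`, `read_word` and `read_byte` are made-up names, and little-endian byte order is my assumption since the answer does not say which order the 4 bytes arrive in.

```python
# Byte-addressable RAM modeled as a bytearray: one address per byte.
# For easy checking, the byte at address n holds the value n.
ram = bytearray(range(16))

WORD_SIZE = 4  # bytes per word on the assumed 32-bit machine


def read_word(memory, byte_address):
    """Return the 4 bytes starting at byte_address as one integer.

    Little-endian order is an assumption made for this sketch.
    """
    chunk = memory[byte_address:byte_address + WORD_SIZE]
    return int.from_bytes(chunk, "little")


def read_byte(memory, byte_address):
    """Return the single byte at byte_address (the opcode fetch case)."""
    return memory[byte_address]
```

In both cases the address that goes in is a byte address; the only thing that changes is how many bytes come back.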
So that is what it means when we say a certain register is memory addressable or byte addressable. Now what will happen when you put, say, decimal 2 in binary on the MAR with the intention of reading word 2, not byte no. 2?
Word no. 2 means bytes 8, 9, 10, 11 on a 32-bit machine (counting words from 0, so word 0 is bytes 0 to 3 and word 1 is bytes 4 to 7). In reality, physical memory is byte addressable only, so there is a trick to handle word addressing.
When MAR is placed on the address bus, its 32 bits do not map onto the 32 address lines (0-31 respectively). Instead, MAR bit 0 is wired to address bus line 2, MAR bit 1 is wired to address bus line 3, and so on. The upper 2 bits of MAR are discarded since they are only needed for word addresses whose byte addresses would reach 2^32 or beyond, none of which are legal for our 32-bit machine.
Using this mapping, when MAR is 1, address 4 is put on the bus, when MAR is 2, address 8 is put on the bus, and so forth.
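In software terms the wiring trick is just a left shift by 2. Here is a minimal sketch of it; `mar_to_bus_address` and `ADDRESS_BITS` are names I made up for illustration, and the mask models the two upper MAR bits that have no address line to go to.

```python
ADDRESS_BITS = 32  # width of the address bus on the assumed machine


def mar_to_bus_address(mar):
    """Map a word address held in MAR to the byte address on the bus.

    MAR bit 0 drives address line 2, bit 1 drives line 3, and so on,
    which is a left shift by 2. Bits shifted past line 31 are dropped.
    """
    return (mar << 2) & ((1 << ADDRESS_BITS) - 1)
```

So `mar_to_bus_address(1)` gives 4 and `mar_to_bus_address(2)` gives 8, matching the mapping described above.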
It is a bit difficult to understand in the beginning. I learnt it from Andrew Tanenbaum's Structured Computer Organization.
This image should make it easy to understand: http://i.stack.imgur.com/rpB7N.png
Simply put,
• In the byte addressing scheme, the first word starts at address 0, and the second word starts at address 4.
• In the word addressing scheme, all bytes of the first word are located at address 0, and all bytes of the second word are located at address 1.
The advantages of byte-addressability are clear when we consider applications that process data one byte at a time. Accessing a single byte in a byte-addressable system requires issuing only a single address. In a 16-bit word-addressable system, it is necessary first to compute the address of the word containing the byte, fetch that word, and then extract the byte from the two-byte word. Although the processes for byte extraction are well understood, they are less efficient than directly accessing the byte. For this reason, many modern machines are byte addressable.
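To make the three steps concrete, here is a rough Python sketch of a byte load on a 16-bit word-addressable memory. The names (`words`, `load_byte`) are hypothetical, and I am assuming the first byte of each pair sits in the low half of the word; a real machine could just as well use the opposite order.

```python
# Word-addressable memory: each address names one 16-bit word.
# Assumed layout: the first byte of each pair is the low byte.
words = [0x4241, 0x4443]  # the bytes 'A', 'B', 'C', 'D' packed two per word


def load_byte(memory, byte_index):
    """Fetch one byte from a memory that can only deliver whole words."""
    word_address = byte_index // 2   # step 1: which word holds the byte
    position = byte_index % 2        # which half of that word we want
    word = memory[word_address]      # step 2: fetch the whole word
    return (word >> (8 * position)) & 0xFF  # step 3: extract the byte
```

On a byte-addressable machine all of this collapses into a single load with a single address.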
Addressability is the size of a unit of memory that has its own address. It's also the smallest chunk of memory that you can modify without affecting its neighbours.
For example: a machine where bytes are the normal 8 bits, and the word-size = 4 bytes. If it's a word-addressable machine, there's no such thing as the address of the second byte of an int. Dealing with strings (e.g. an array like char str[]) becomes inconvenient, because you still store characters packed together. Modifying just str[1] means loading the word that contains it, doing some shift/and/or operations to apply the change, then doing a word store.
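Roughly, that load / shift-and-or / store sequence looks like the sketch below. It models memory as a Python list of 32-bit words; `store_byte` and `packed` are made-up names, and placing byte 0 in the low 8 bits of the word is an assumption on my part.

```python
WORD_BYTES = 4  # 32-bit words, 8-bit bytes


def store_byte(memory, byte_index, value):
    """Overwrite one byte inside a list of 32-bit words (read-modify-write)."""
    word_address, position = divmod(byte_index, WORD_BYTES)
    shift = 8 * position                    # assumed: byte 0 is the low byte
    word = memory[word_address]             # word load
    word &= ~(0xFF << shift) & 0xFFFFFFFF   # clear the old byte
    word |= (value & 0xFF) << shift         # insert the new byte
    memory[word_address] = word             # word store


# "abcd" packed into a single word, then the equivalent of str[1] = 'X'
packed = [int.from_bytes(b"abcd", "little")]
store_byte(packed, 1, ord("X"))
```

Afterwards the word holds "aXcd": the neighbouring bytes survive only because the code explicitly preserved them, which is exactly the inconvenience being described.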
Note that this is different from a machine that doesn't allow unaligned word load/stores (where the low 2 bits of a word address have to be 0). Such machines usually have a byte load/store instruction. We're talking about machines without even that.
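For comparison, the alignment rule on such a byte-addressable machine is just a test of the low address bits. A tiny sketch (the function name is made up, and it assumes the word size is a power of two):

```python
def is_word_aligned(byte_address, word_bytes=4):
    """True when the low bits of the byte address are all zero.

    Assumes word_bytes is a power of two, so word_bytes - 1 is a bit mask.
    """
    return (byte_address & (word_bytes - 1)) == 0
```

Address 8 passes the check and address 6 does not, but byte 6 is still individually reachable with a byte load/store, which is the difference from a truly word-addressable machine.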
CPU addresses might actually still include the low bits, but require them to always be zero (or ignore them). However, after checking that they're zero, they could be discarded, so the rest of the memory system only sees the word address, where two adjacent words have addresses that differ by 1 (not 4). However, on a 16-bit CPU where a register can only hold 64k different addresses, you wouldn't likely do this. Each separate CPU address would refer to a different 2 bytes of memory, instead of discarding the low bit. 2-byte word-addressable memory would let you address 128 KiB of memory, instead of just 64 KiB with byte-addressable memory.
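The 64 KiB versus 128 KiB figures follow directly from (number of distinct addresses) × (bytes per addressable unit). A quick sketch to check the arithmetic, with a made-up helper name:

```python
def addressable_bytes(address_bits, unit_bytes):
    """Total memory reachable when every address names unit_bytes bytes."""
    return (2 ** address_bits) * unit_bytes


byte_addressed = addressable_bytes(16, 1)  # 16-bit addresses, 1-byte units
word_addressed = addressable_bytes(16, 2)  # 16-bit addresses, 2-byte units
```

Here `byte_addressed` is 65536 (64 KiB) and `word_addressed` is 131072 (128 KiB).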
Fun fact: ARM used to use the low 2 bits of an address as a shuffle control for unaligned word loads. (But it always had byte load/store instructions.)
See also:
- https://en.wikipedia.org/wiki/Word-addressable
- https://en.wikipedia.org/wiki/Byte_addressing
Note that bit-addressable memory could exist, but doesn't. 8-bit bytes are nearly universally standard now. (Ancient computers sometimes had larger bytes, see the history section of Wikipedia's Byte article.)