How does an assembly instruction turn into voltage changes on the CPU?

I've been working in C and CPython for the past 3 - 5 years. Consider that my base of knowledge here.

If I were to issue an assembly instruction such as MOV AL, 61h to a processor that supported it, what exactly is inside the processor that interprets this code and dispatches it as voltage signals? How would such a simple instruction likely be carried out?

Assembly even feels like a high level language when I try to think of the multitude of steps contained in MOV AL, 61h or even XOR EAX, EBX.

EDIT: I read a few comments asking why I put this as embedded when the x86-family is not common in embedded systems. Welcome to my own ignorance. Now I figure that if I'm ignorant about this, there are likely others ignorant of it as well.

It was difficult for me to pick a favorite answer considering the effort you all put into your answers, but I felt compelled to make a decision. No hurt feelings, fellas.

I often find that the more I learn about computers the less I realize I actually know. Thank you for opening my mind to microcode and transistor logic!

EDIT #2: Thanks to this thread, I have just comprehended why XOR EAX, EAX is faster than MOV EAX, 0h. :)


I recently started reading Charles Petzold's book Code, which so far covers exactly the kinds of things I assume you are curious about. I have not gotten all the way through it, though, so thumb through the book before buying or borrowing it.

This is my relatively short answer, not Petzold's, and hopefully it is in line with what you were curious about.

You have heard of the transistor, I assume. The original way to use a transistor was for things like a transistor radio. It is basically an amplifier: take the tiny radio signal floating in the air and feed it into the input of the transistor, which opens or closes the flow of current on a circuit next to it. You wire that circuit with higher power, so you can take a very small signal, amplify it, and feed it into a speaker, for example, and listen to the radio station (there is more to it, such as isolating the frequency and keeping the transistor balanced, but you get the idea, I hope).

Once the transistor existed, that led to a way to use a transistor as a switch, like a light switch. The radio is like a dimmer switch: you can turn it anywhere from all the way on to all the way off. A non-dimmer light switch is either all on or all off, and there is some magic place in the middle of the switch where it changes over. We use transistors the same way in digital electronics. Take the output of one transistor and feed it into another transistor's input. The output of the first is certainly not a small signal like the radio wave; it forces the second transistor all the way on or all the way off. That leads to the concept of TTL, or transistor-transistor logic. Basically you have one transistor that drives a high voltage, let's call it a 1, and one that sinks to zero voltage, let's call that a 0. You arrange the inputs with other electronics so that you can create AND gates (if both inputs are a 1 then the output is a 1), OR gates (if either one or the other input is a 1 then the output is a 1), inverters, NAND gates (an AND with an inverter), NOR gates (an OR with an inverter), etc. There used to be a TTL handbook, and you could buy small chips with 8 to 14 or so pins that had one or two or four of some kind of gate (NAND, NOR, AND, etc.) inside, with two inputs and an output for each. Now we don't need those; it is cheaper to create programmable logic or dedicated chips with many millions of transistors. But we still think in terms of AND, OR, and NOT gates for hardware design (usually more like NAND and NOR).
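To make the "everything is built from a few gates" idea concrete, here is a minimal sketch in Python that treats a NAND gate as the only primitive and derives the other gates from it. The function names (nand, not_, and_, or_, nor, xor, truth_table) are made up for this illustration, and 0/1 integers stand in for the low/high voltages.

```python
def nand(a: int, b: int) -> int:
    """The only primitive here: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a: int) -> int:
    # An inverter is a NAND with both inputs tied together.
    return nand(a, a)


def and_(a: int, b: int) -> int:
    # AND is a NAND followed by an inverter.
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    # OR is a NAND of the inverted inputs (De Morgan's law).
    return nand(not_(a), not_(b))


def nor(a: int, b: int) -> int:
    # NOR is an OR followed by an inverter.
    return not_(or_(a, b))


def xor(a: int, b: int) -> int:
    # The classic four-NAND construction of XOR.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def truth_table(gate):
    """Outputs of a two-input gate for inputs 00, 01, 10, 11, in that order."""
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for gate in (nand, and_, or_, nor, xor):
        print(gate.__name__, truth_table(gate))
```

In real hardware each of these functions would be a handful of transistors wired together, not a function call, but the composition works the same way.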

I don't know what they teach now, but the concept is the same: for memory, a flip-flop can be thought of as two of these TTL gates (NANDs) tied together, with the output of one going to the input of the other. Let's leave it at that. That is basically a single bit in what we call SRAM, or static RAM. An SRAM bit takes a handful of transistors (typically four to six). DRAM, or dynamic RAM, the memory sticks you put in your computer yourself, takes one transistor per bit, so for starters you can see why DRAM is the thing you buy gigabytes of. SRAM bits remember what you set them to so long as the power doesn't go out. DRAM starts to forget what you told it as soon as you tell it. Basically, DRAM uses the transistor in yet a third way: there is some capacitance (as in capacitor; I won't get into that here) that acts like a tiny rechargeable battery, and as soon as you charge it and unplug the charger it starts to drain.

Think of a row of glasses on a shelf with little holes in each glass. These are your DRAM bits. You want some of them to be ones, so you have an assistant fill up the glasses you want to be a one. That assistant has to constantly refill the pitcher, go down the row, keep the "one" glasses full enough with water, and let the "zero" glasses remain empty. That way, any time you want to see what your data is, you can look over and read the ones and zeros: a water level definitely above the middle is a one, and a level definitely below the middle is a zero. So even with the power on, if the assistant is not able to keep the glasses full enough to tell a one from a zero, they will eventually all drain out and look like zeros. It's the trade-off for more bits per chip.

So the short story here is that outside the processor we use DRAM for our bulk memory, and there is assistant logic that takes care of keeping the ones a one and the zeros a zero. But inside the chip, the AX and DS registers, for example, keep your data using flip-flops or SRAM. And for every bit you know about, like the bits in the AX register, there are likely hundreds or thousands or more that are used to get the bits into and out of that AX register.
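Here is a rough sketch of both kinds of memory bit, written in Python. NandLatch models two cross-coupled NAND gates (the inputs are active-low, which is how that particular wiring behaves), and LeakyDramCell models the glass-with-a-hole analogy. The class names, the charge units (0 to 100), and the leak rate are all invented for illustration; real cells are analog circuits, not integer counters.

```python
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1


class NandLatch:
    """One static bit: two NAND gates, each feeding the other's input."""

    def __init__(self):
        self.q, self.q_bar = 0, 1

    def step(self, set_n: int, reset_n: int) -> int:
        # Inputs are active-low: set_n=0 stores a 1, reset_n=0 stores a 0,
        # and both at 1 means "hold". Loop a few times so the feedback settles.
        for _ in range(4):
            q = nand(set_n, self.q_bar)
            q_bar = nand(reset_n, q)
            self.q, self.q_bar = q, q_bar
        return self.q


class LeakyDramCell:
    """One dynamic bit: a charge that drains unless something refreshes it."""

    FULL, THRESHOLD, LEAK_PER_TICK = 100, 50, 20  # hypothetical units

    def __init__(self):
        self.charge = 0

    def write(self, bit: int) -> None:
        self.charge = self.FULL if bit else 0

    def tick(self) -> None:
        # Time passes and some charge leaks away.
        self.charge = max(0, self.charge - self.LEAK_PER_TICK)

    def read(self) -> int:
        return 1 if self.charge > self.THRESHOLD else 0

    def refresh(self) -> None:
        # The "assistant": read the bit and write the same value back at full level.
        self.write(self.read())


if __name__ == "__main__":
    latch = NandLatch()
    latch.step(0, 1)            # set
    print("latch holds", latch.step(1, 1))

    cell = LeakyDramCell()
    cell.write(1)
    for _ in range(3):
        cell.tick()
    print("unrefreshed DRAM cell reads", cell.read())
```

The latch keeps its value for as long as it is left in the hold state, while the DRAM cell only keeps a 1 if refresh() is called often enough.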

You know that processors run at some clock speed, these days around 2 gigahertz, or two billion clocks per second. The clock is generated by a crystal (another topic), but the logic sees that clock as a voltage that goes high, zero, high, zero at that clock rate, 2 GHz or whatever (a Game Boy Advance runs at about 17 MHz, old iPods around 75 MHz, the original IBM PC at 4.77 MHz).

So transistors used as switches allow us to take voltage and turn it into the ones and zeros we are familiar with both as hardware engineers and software engineers, and go so far as to give us AND, OR, and NOT logic functions. And we have these magic crystals that allow us to get an accurate oscillation of voltage.

So we can now do things like say: if the clock is a one, and my state variable says I am in the fetch-instruction state, then I need to switch some gates so that the address of the instruction I want, which is in the program counter, goes out on the memory bus, so that the memory logic can give me my instruction for MOV AL,61h. You can look this up in an x86 manual and find that some of those opcode bits say this is a mov operation, that the target is the lower 8 bits of the EAX register, and that the source of the mov is an immediate value, which means it is in the memory location right after the opcode. So we need to save that instruction/opcode somewhere and fetch the next memory location on the next clock cycle. Now we have saved the "mov al, immediate" and we have the value 61h read from memory, and we can switch some transistor logic so that bit 0 of that 61h is stored in the bit 0 flip-flop of AL, bit 1 in bit 1, and so on.
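The fetch-then-execute sequence just described can be sketched as a tiny state machine in Python. The one real-world fact used here is that x86 encodes MOV AL, imm8 as the opcode byte B0h followed by the immediate byte. Everything else (the ToyCpu class, the state names, one memory read per clock) is a simplifying assumption for illustration and is not how an actual x86 is organized internally.

```python
MOV_AL_IMM8 = 0xB0  # x86 opcode byte for "MOV AL, imm8"


class ToyCpu:
    """Hypothetical one-instruction CPU that advances one state per clock."""

    def __init__(self, memory: bytes):
        self.memory = memory
        self.pc = 0                   # program counter
        self.al = 0                   # the 8-bit AL register
        self.opcode = None            # latched opcode of the current instruction
        self.state = "FETCH_OPCODE"

    def clock(self) -> None:
        # Put the program counter on the "memory bus" and read one byte back.
        byte = self.memory[self.pc]
        self.pc += 1

        if self.state == "FETCH_OPCODE":
            self.opcode = byte        # save the opcode for the next cycle
            if byte == MOV_AL_IMM8:
                self.state = "FETCH_IMMEDIATE"
            else:
                raise NotImplementedError(f"opcode {byte:#04x} not modeled")
        elif self.state == "FETCH_IMMEDIATE":
            self.al = byte            # every bit of the byte lands in AL
            self.state = "FETCH_OPCODE"


if __name__ == "__main__":
    cpu = ToyCpu(bytes([0xB0, 0x61]))  # machine code for MOV AL, 61h
    cpu.clock()                        # cycle 1: fetch and decode the opcode
    cpu.clock()                        # cycle 2: fetch the immediate into AL
    print(hex(cpu.al))
```

In hardware the if/elif chain is not executed step by step; it is combinational logic that is always "evaluating", and the clock decides when the results get latched.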

How does all that happen, you ask? Think about a Python function performing some math formula. You start at the top of the program with some inputs to the formula that come in as variables, you have individual steps through the program that might add a constant here or call the square root function from a library, etc., and at the bottom you return the answer. Hardware logic is done the same way, and today hardware description languages (HDLs) are used for it, one of which looks a lot like C. The main difference is that your hardware functions might have hundreds or thousands of inputs, and the output is a single bit. On every clock cycle, bit 0 of the AL register is being computed with a huge algorithm, depending on how far out you want to look. Think about that square root function you called for your math operation: that function is itself one of these "some inputs produce an output" things, and it may call other functions, maybe a multiply or divide. So you likely have a bit somewhere that you can think of as the last step before bit 0 of the AL register, and its function is: if clock is one then AL[0] = AL_next[0]; else AL[0] = AL[0]; But there is a higher function that computes that next AL bit from other inputs, and a higher function above that, and so on. Much of this is created by the compiler, in the same way that your three lines of Python can turn into hundreds or thousands of lines of assembler. A few lines of HDL can become hundreds or thousands or more transistors. Hardware folks don't normally look at the lowest-level formula for a particular bit to find out all the possible inputs and all the ANDs, ORs, and NOTs it takes to compute it, any more than you probably inspect the assembler generated by your programs. But you could if you wanted to.
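As a sketch of that "last step before bit 0 of AL", here is the same idea in Python: every bit of the register has a small function that decides its next value, and a clock edge copies all the next values into the register at once. The names (mux, al_next_bit, clock_edge, load_al) are hypothetical; in a real design this would be written in an HDL and the load_al signal would itself be the output of a lot of decode logic.

```python
def mux(select: int, when_0: int, when_1: int) -> int:
    """Two-input multiplexer: pick one of two inputs based on a select bit."""
    return when_1 if select else when_0


def al_next_bit(current_bit: int, load_al: int, data_bit: int) -> int:
    # Next value of one AL bit: take the new data if the decoder says
    # "load AL" this cycle, otherwise keep what is already stored.
    return mux(load_al, current_bit, data_bit)


def clock_edge(al_bits, load_al, data_bits):
    """On the clock edge, every bit takes its computed next value at once."""
    return [al_next_bit(cur, load_al, new) for cur, new in zip(al_bits, data_bits)]


def to_bits(value: int, width: int = 8):
    """Split an integer into a list of bits, bit 0 first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits) -> int:
    """Reassemble a list of bits (bit 0 first) into an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    al = to_bits(0x00)
    al = clock_edge(al, load_al=1, data_bits=to_bits(0x61))  # MOV AL, 61h
    al = clock_edge(al, load_al=0, data_bits=to_bits(0xFF))  # not loading: AL holds
    print(hex(from_bits(al)))
```

The point of the sketch is that "holding" a value is also an active computation: on every clock, each bit is reloaded, just sometimes with its own old value.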

A note on microcoding: most processors do not use microcoding. You get into it with the x86, for example, because it was a fine instruction set for its day but on the surface struggles to keep up with modern times. Other instruction sets do not need microcoding and use logic directly in the way I described above. You can think of microcoding as a different processor, using a different instruction set/assembly language, that is emulating the instruction set you see on the surface. It is not as complicated as when you try to emulate Windows on a Mac or Linux on Windows, etc. The microcoding layer is designed specifically for the job. You may think of there being only the four registers AX, BX, CX, and DX, but there are many more inside. And naturally that one assembly program can somehow get executed on multiple execution paths in one core or on multiple cores.

Just like the program in the processor of your alarm clock or washing machine, the microcode program is simple, small, debugged, and burned into the hardware, hopefully never needing a firmware update. At least ideally. But as with your iPod or phone, you sometimes do want a bug fix or whatever, and there is a way to upgrade your processor (the BIOS or other software loads a patch on boot). Say you open the battery compartment of your TV remote control or calculator: you might see a hole where you can see some bare metal contacts in a row, maybe three or five or many. For some remotes and calculators, if you really wanted to, you could reprogram it and update the firmware. Normally you don't, though; ideally that remote is perfect, or perfect enough to outlive the TV set.

Microcoding provides the ability to get a very complicated product (millions, hundreds of millions of transistors) on the market and fix the big and fixable bugs in the field down the road. Imagine a 200-million-line Python program your team wrote in, say, 18 months, and having to deliver it or the company will lose to the competition's product. It is the same kind of thing, except that only a small portion of that code can be updated in the field; the rest has to remain carved in stone. For the alarm clock or toaster, if there is a bug or the thing needs help, you just throw it out and get another.

If you dig through Wikipedia or just google around, you can look at the instruction sets and machine language for things like the 6502, Z80, 8080, and other processors. There may be 8 registers and 250 instructions, and you can get a feel from the number of transistors that those 250 assembly instructions are still a very high-level language compared to the sequence of logic gates it takes to compute each bit in a flip-flop per clock cycle. You are correct in that assumption. Except for the microcoded processors, this low-level logic is not reprogrammable in any way; you have to fix the hardware bugs with software (for hardware that has been or is going to be delivered and not scrapped).

Look up that Petzold book, he does an excellent job of explaining stuff, far superior to anything I could ever write.


Edit: http://visual6502.org is an example of a CPU (the 6502) that has been simulated using Python/JavaScript AT THE TRANSISTOR LEVEL. You can put your own code in to see how it does what it does.

Edit: An excellent 10,000 m level view: The Soul of a New Machine by Tracy Kidder

I had great difficulty envisioning this until I did microcoding. Then it all made sense (abstractly). This is a complex topic, but here is a very, very high-level view.

Essentially think of it like this.

A CPU instruction is essentially a set of charges stored in the electrical circuits that make up memory. There is circuitry that causes those charges to be transferred from the memory to the inside of the CPU. Once inside the CPU, the charges are set as inputs to the wiring of the CPU's circuitry. This is essentially a mathematical function that will cause more electrical output to occur, and the cycle continues.

Modern CPUs are far, far more complex and include many layers of microcoding, but the principle remains the same. Memory is a set of charges. There is circuitry to move the charges, and other circuitry to carry out functions which result in other charges (output) being fed to memory or to other circuitry that carries out other functions.

To understand how the memory works, you need to understand logic gates and how they are created from multiple transistors. This leads to the discovery that hardware and software are equivalent, in the sense that they essentially perform functions in the mathematical sense.


This is a question that requires more than an answer on StackOverflow to explain.

To learn about this all the way from the most basic electronic components up to basic machine code, read The Art of Electronics, by Horowitz and Hill. To learn more about computer architecture, read Computer Organization and Design by Patterson and Hennessy. If you want to get into more advanced topics, read Computer Architecture: A Quantitative Approach, by Hennessy and Patterson.

By the way, The Art of Electronics also has a companion lab manual. If you have the time and resources available, I would highly recommend doing the labs; I actually took the classes taught by Tom Hayes, in which we built a variety of analog and digital circuits, culminating in building a computer from a 68k chip, some RAM, some PLDs, and some discrete components. You would enter machine code directly into RAM using a hexadecimal keypad; it was a blast, and a great way to get hands on experience at the very lowest levels of a computer.


Explaining the whole system in any detail is impossible to do without entire books, but here is a very high level overview of a simplistic computer:

  • At the lowest level there is physics and materials (e.g. transistors made from doped silicon).
  • Using physics and materials, you can derive the NAND logic gate.
  • Using the NAND gate, you can derive all the other basic logic gates (AND, OR, XOR, NOT, etc), or for efficiency build them directly from transistors, including versions with more than 2 inputs.
  • Using the basic logic gates, you can derive more complicated circuits such as the adder, the multiplexer, and so forth.
  • Also using the basic logic gates, you can derive stateful digital circuit elements such as the flip flop, the clock, and so forth.
  • Using your more complicated stateful circuits, you can derive higher-level pieces like counters, memory, registers, the arithmetic-logic-unit, etc.
  • Now you just have to glue your high level pieces together such that:
    • A value comes out of memory
    • The value is interpreted as an instruction by dispatching it to the appropriate place (e.g. the ALU or memory) using multiplexers and the like. (Basic instruction types are read-from-memory-into-register, write-from-register-into-memory, perform-operation-on-registers, and jump-to-instruction-on-condition.)
    • The process repeats with the next instruction
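As an illustration of the middle layers of that list (gates, then an adder), here is a minimal Python sketch that builds an 8-bit ripple-carry adder out of nothing but a NAND function. The names (nand, xor, and_, or_, full_adder, add8) are invented for the example, and a ripple-carry adder is only the simplest way to add; real ALUs typically use faster carry schemes.

```python
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1


def and_(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(n, n)


def or_(a: int, b: int) -> int:
    return nand(nand(a, a), nand(b, b))


def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def add8(x: int, y: int):
    """8-bit ripple-carry adder: chain eight full adders, carry to carry."""
    carry = 0
    result = 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # carry is the final carry-out (overflow of 8 bits)


if __name__ == "__main__":
    print(add8(0x61, 0x01))
    print(add8(0xFF, 0x01))
```

The for loop stands in for eight copies of the same circuit sitting side by side; in hardware there is no loop, only the carry signal rippling from one adder to the next.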

To understand how an assembly instruction causes a voltage change, you simply need to understand how each of those levels is represented by the level below. For example, an ADD instruction will cause the value of two registers to propagate to the ALU, which has circuits that compute all of the logic operations. Then a multiplexer on the other side, being fed the ADD signal from the instruction, selects the desired result, which propagates back to one of the registers.
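The ADD example above can be sketched in Python as follows. The alu function computes every operation it knows about and then lets the decoded instruction pick one result, which is the job the multiplexer does. The function names, the register dictionary, and the two-operand "destination is also the first source" form (borrowed from x86) are assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF  # keep results to 32 bits, like EAX/EBX


def alu(op: str, a: int, b: int) -> int:
    # All of the ALU's circuits see the operands and produce a result...
    results = {
        "ADD": (a + b) & MASK32,
        "AND": a & b,
        "OR": a | b,
        "XOR": a ^ b,
    }
    # ...and the multiplexer, driven by the decoded instruction, picks one.
    return results[op]


def execute(registers: dict, op: str, dest: str, src: str) -> None:
    """Route two register values through the ALU and write the result back."""
    registers[dest] = alu(op, registers[dest], registers[src])


if __name__ == "__main__":
    regs = {"EAX": 5, "EBX": 3}
    execute(regs, "ADD", "EAX", "EBX")   # ADD EAX, EBX
    print(regs["EAX"])
    execute(regs, "XOR", "EAX", "EAX")   # XOR EAX, EAX clears the register
    print(regs["EAX"])
```

Note that XOR EAX, EAX always leaves zero in EAX, since any bit XORed with itself is 0, without needing an immediate value to be fetched from memory.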


This is a big question, and at most universities there's an entire semester-long class to answer it. So, rather than give you some terribly butchered summary in this little box, I'll instead direct you to the textbook that has the whole truth: Computer Organization and Design: The Hardware/Software Interface by Patterson and Hennessy.