To learn assembly - should I start with 32 bit or 64 bit?

Solution 1:

When people refer to 32-bit and 64-bit assembly, they're talking about which instruction set you'll use. In the Intel case, which I presume you're asking about, these are also called IA-32 and x64 (or x86-64). There is a lot more going on in the 64-bit case, so starting with 32-bit is probably a good idea; you just need to make sure you're assembling your program with a 32-bit assembler into a 32-bit binary. 64-bit Windows will still know how to run it.

What I really recommend for getting started with assembly is something with a simpler instruction set to get a handle on. Go learn MIPS assembly - the SPIM simulator is great and easy to use. If you really want to dive straight into the Intel assembly world, write yourself a little C program that calls your assembly routines for you; doing all the setup and teardown for a 'real program' is a big mess, and you'd struggle to even get started that way. So just write a C wrapper with main() in it, and compile and link that with the object files you get from assembling your code.
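
A minimal sketch of such a wrapper, under some loud assumptions: the routine name sum_array and its signature are made up for illustration, and a plain C stand-in is included so the file builds on its own. The driver is written as run_wrapper(); in a real wrapper its body would simply be the body of main().

```c
#include <stdio.h>

/* Prototype for the routine you would write in assembly.
   The name sum_array and its signature are hypothetical. */
int sum_array(const int *values, int count);

/* Stand-in C definition so this sketch builds by itself.
   In a real project, delete this function and link the object
   file produced by your assembler instead. */
int sum_array(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* The wrapper logic: set up some data, call the routine, report.
   Returns 0 on success so it can double as an exit status. */
int run_wrapper(void)
{
    int data[] = { 1, 2, 3, 4 };
    int result = sum_array(data, 4);

    printf("sum_array returned %d\n", result);
    return result == 10 ? 0 : 1;
}
```

With NASM and GCC on 32-bit-capable Linux, for instance, the build might look like `nasm -f elf32 routine.asm`, then `gcc -m32 wrapper.c routine.o` (the file names are placeholders). The C runtime handles startup and teardown, so the assembly file only has to export one function that follows the platform's calling convention.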

Please don't get in the habit of writing inline assembly in your C code - it's a code portability nightmare, and there's rarely a good reason for it.

You can download all of the Intel 64 and IA-32 Architectures Software Developer's Manuals to get started.

Solution 2:

I started writing assembly in 1977 by taking the long route: first learning basic operations (and, or, xor, not) and octal math before writing programs for the DEC PDP-8/E with OS/8 and 8K of memory.

Since then I have discovered a few tricks for learning assembly on architectures I am unfamiliar with. There have been a few: 8080/8085/Z80, x86, 68000, VAX, 360, HC12, PowerPC and V850. I seldom write stand-alone programs; it's usually functions that are linked with the rest of the system, which is usually written in C.

So first of all I must be able to interface with the rest of the software, which requires learning the parameter passing, stack layout, creating the stack frame, parameter positions, local variable positions, discarding the stack frame, returned values, return and stack cleanup. The best way to do this is to write a function that calls another function in C and examine the code listing generated by the compiler.
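
As a small sketch of that exercise (the function names are invented for illustration), a caller/callee pair like this touches most of the items in the list: two parameters, one local variable, a return value and a call site.

```c
/* Callee: two parameters, one local variable, one return value.
   With optimization off, the local usually gets its own slot in
   the stack frame, which makes the frame layout easy to read. */
int scale_and_add(int value, int factor)
{
    int scaled = value * factor;
    return scaled + 1;   /* on x86 an int result typically comes back in eax */
}

/* Caller: shows how the arguments are set up before the call and
   what cleanup, if any, happens after it returns. */
int caller(void)
{
    return scale_and_add(5, 3);
}
```

Assuming GCC and a file called frame.c (a placeholder name), `gcc -S -O0 frame.c` writes the listing to frame.s; other compilers have an equivalent listing option. Comparing the listing at -O0 and -O2 is instructive too, since the optimizer may inline the call entirely.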

To learn the assembly language itself, I write some simple code, see what the compiler generates and single-step through it in a raw debugger. I keep the instruction set manuals close by so I can look up instructions I am unsure of.

A good thing to get to know (in addition to the stack handling mentioned previously) is how the compiler generates machine code given a certain high-level language construct. One such sequence is how indexed arrays/structures are translated into pointers. Another is the basic machine code sequence for loops.
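
To illustrate the array-to-pointer translation, here is a sketch with the same loop written twice: once with an index, and once in the pointer-walking form that compilers often reduce it to. The struct and function names are made up, and what a given compiler actually emits depends on the target and optimization level.

```c
#include <stddef.h>

struct point { int x; int y; };

/* Indexed form: the address of pts[i].y is
   base + i * sizeof(struct point) + offset of y. */
int sum_y_indexed(const struct point *pts, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += pts[i].y;
    return total;
}

/* Pointer form: the multiplication is gone; the loop just advances
   a pointer by sizeof(struct point) and compares it to an end address. */
int sum_y_pointer(const struct point *pts, size_t n)
{
    int total = 0;
    const struct point *end = pts + n;
    for (const struct point *p = pts; p != end; p++)
        total += p->y;
    return total;
}
```

Both functions return the same result; putting their listings side by side shows the basic loop skeleton (initialize, compare, body, increment, branch back) as well as the addressing.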

So what is a "raw debugger"? To me it's a debugger that is part of a simple development package and that doesn't try to protect me from the hardware the way the Visual Studio debugger(s) do. In it I can easily switch between source and assembly debugging. It also starts quickly from inside the development IDE. It doesn't have three thousand features, more likely thirty, and those will be the ones you use 99.9% of the time. The development package will typically come as an installer where you click once for license approval, once to accept the default setup (don't you love it when someone has thought about and done that work for you?) and a last time to install.

I have one favorite simple development environment for x86-32 (IA-32) and that is OpenWatcom. You can find it at openwatcom.org.

I am fairly new to x86-64 (AMD64), but the transition seems straightforward (much like moving from x86-16 to x86-32), with some extras such as the additional registers r8 to r15 and the main registers being 64 bits wide. I recently ran across a development environment for XP/64, Vista/64 and 7/64 (it probably works on the server OSes as well) called Pelles C (pellesc.org). It is written and maintained by one Pelle Orinius in Sweden, and from the few hours I've spent with it I can say that it is destined to become my favorite for x86-64. I've also tried the Visual Express packages (they install so much junk - do you know how many uninstalls you need to do afterwards? More than 20) and tried to get gcc from one place to work with an IDE (Eclipse or something else) from another.

Once you've come this far and you come across a new architecture you will be able to spend an hour or two looking at the generated listing and after that pretty much know what other architecture it resembles. If the index and loop constructs appear strange you can look over the source code generating them and perhaps also the compiler optimization level.

I think I should warn you that once you get the hang of it you will notice that at desks close by, at the coffee machine, in meetings, in fora and lots of other places there will be individuals waiting to scorn you, make fun of you, throw incomplete quotes at you and give uninformed/incompetent advice because of your interest in assembly. Why they do this I don't know. Perhaps they themselves are failed assembly programmers; perhaps they only know OO (C++, C# and Java) and simply don't have a clue as to what assembly is about. Perhaps someone they "know" (or whom a friend of theirs knows) who is "really good" may have read something in a forum or heard something at a conference and therefore can deliver an absolute truth as to why assembly is a complete waste of time. There are plenty of them here on Stack Overflow.

Solution 3:

Get IDA Pro. It's the bee's knees for working with assembly.

I personally don't see much of a difference between 32-bit and 64-bit. It is not about the bits but the instruction set. When you talk about assembly you talk about instruction sets. Perhaps they are implying that a 32-bit instruction set is better to learn from. However, if that is your goal I suggest Donald Knuth's books on algorithms (The Art of Computer Programming) -- they teach algorithms in the assembly language of a hypothetical machine called MIX (MMIX in the newer material) :D

For portability, I suggest that instead of inline assembly you learn how to use compiler intrinsics -- they will be your best optimization tool in non-embedded code. :D
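
A small sketch of the intrinsic approach, assuming GCC or Clang, where `__builtin_popcount` is a compiler builtin that can compile down to a single population-count instruction on targets that have one. The wrapper name count_bits and the fallback are made up for illustration; other compilers expose similar operations under their own names.

```c
#include <stdint.h>

/* Plain C fallback: clears the lowest set bit on each pass,
   so the loop runs once per set bit. */
int popcount_portable(uint32_t x)
{
    int count = 0;
    while (x) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Counts set bits, using the compiler builtin where available.
   No inline assembly is involved, so the same source still builds
   on targets that lack a popcount instruction. */
int count_bits(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcount(x);
#else
    return popcount_portable(x);
#endif
}
```

The point of the design is that the compiler picks the instruction, handles register allocation around it, and falls back to ordinary code where the hardware offers nothing better.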

Solution 4:

"... but want a better understanding of what's going on at a lower level"

If you really want to know everything that's going on at a lower level on x86/x64 processors/systems, I would really recommend starting with the basics, that is, 16-bit real mode code as run on the 286/386. For example, in 16-bit code you are forced to use memory segmentation, which is an important concept to understand. Even today an x86 processor starts out in real mode, and the boot code then switches to/between the relevant modes.
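
For a feel of what segmentation means in real mode: a segment:offset pair maps to a linear address as segment * 16 + offset. A minimal sketch of that arithmetic in C (the function name is made up, and the sketch ignores the 20-bit wraparound of the original 8086):

```c
#include <stdint.h>

/* Real-mode address translation: shift the 16-bit segment left by
   four bits (multiply by 16) and add the 16-bit offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, B800:0000 maps to 0xB8000, and 1234:0010 and 1235:0000 both map to 0x12350, which shows that many different segment:offset pairs alias the same byte of memory.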

But if you're interested in application/algorithm development, you might not want to learn all the low-level OS stuff. Instead you can start right off with x86/x64 code, depending on your platform. Note that 32-bit code will also run on 64-bit Windows, but not the other way round.