How does OSX run 64bit Binaries while running on a 32bit Kernel?
I recently found out that Mac OS X actually CAN run 64-bit (x64) applications even when the 32-bit (x86) kernel is loaded. That shocked me at first.
But then I realized that it would be really weird if a system running on an x64-capable CPU couldn't run x64 applications, no matter which kernel is managing the processes. Is it really so hard? Just load the damn app into memory and point the CPU's instruction pointer at the first byte, easy as pie!
The one and only barrier to doing that, as far as I can imagine, is some kind of "executable header". Unfortunately, I am not very comfortable with Windows architecture and binary structure, so I need more explanation here.
ELF, the de facto binary format standard for UNIX-like OSes, has a sibling, ELF64, which (as the document here describes) doesn't differ much from ELF32, and yet 32-bit kernels aren't able to run x64 code. Yes, such a program is likely linked against x64 libraries, so let's imagine we just copied them right into the /usr/lib64 folder. But I'm pretty sure that doesn't help. Why?
And finally, what is so special about the Mac OS X kernel that it doesn't care which instruction set a program uses? Does Mac OS X have some universal executable header that suits both kernels, so that it can just load the app into memory and tell the CPU "execute from right here, I don't care what it is"?
P.S.: I thought a lot about where to post this question, stackoverflow.com or superuser.com, and decided to post it here because the topic is more of an OS-specific thing.
The real question would be why some other operating systems can't run 64-bit binaries on a 32-bit kernel. There is no fundamental reason why it wouldn't be possible. The underlying processor architecture supports both a 64-bit instruction set (amd64 a.k.a. x86-64) and a 32-bit instruction set (i386), and there is no restriction on the two being used together (in particular, switching between them doesn't require resetting the processor: there's a single long mode, within which each code segment is marked either as “native” amd64 code or as i386 “compatibility” code, so both kinds can coexist on a running system).
Running 64-bit applications on a 32-bit kernel does require a little more work inside the kernel, because it must manage 64-bit pointers to user space together with 32-bit pointers to kernel space. Most if not all pointers passed around in the kernel are either known to point to kernel space or known to point to user space, so it isn't a problem if they're different sizes. The main difficulty is forgoing the possibility of having a universal pointer type that has separate ranges of values for process memory, kernel memory and memory used by various pieces of hardware (including RAM), but this isn't possible in recent 32-bit kernels on PC-class hardware anyway (if you have 4GB or more of RAM, or want to map 2GB of RAM plus 2GB of process space plus kernel memory and more, you need to be able to map more than 32 bits' worth of addresses anyway).
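To make the arithmetic in that last parenthesis concrete, here is a small Python sketch. The helper name bits_needed and the 1 GiB figure for kernel memory are assumptions picked purely for illustration; only the 2GB + 2GB figures come from the text above.

```python
GIB = 1 << 30  # bytes in one gibibyte

def bits_needed(total_bytes):
    """Smallest address width (in bits) that can name every byte in total_bytes."""
    return max(1, (total_bytes - 1).bit_length())

ram = 2 * GIB            # physical RAM to be mapped
process_space = 2 * GIB  # user-space address range of one process
kernel_memory = 1 * GIB  # hypothetical figure, for illustration only

total = ram + process_space + kernel_memory

print(bits_needed(4 * GIB))  # exactly what a 32-bit pointer can cover
print(bits_needed(total))    # one universal pointer type would need more
```

As soon as the things to be addressed add up to more than 4 GiB, a single 32-bit pointer type can no longer have a distinct value for each of them, so the kernel has to keep separate kinds of pointers anyway.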
According to the Wikipedia article that you cite, OSX had the ability to run amd64 processes on amd64 processors before it had a 64-bit kernel. Solaris also indifferently mixes i386 and amd64 executables on amd64 processors, regardless of whether the kernel is 32-bit or 64-bit (both are available).
Other operating systems can run i386 processes on a (64-bit) amd64 kernel, but not amd64 processes on a 32-bit kernel, for example Linux, FreeBSD, NetBSD and Windows. Yet other operating systems treat amd64 and i386 as completely different architectures, for example OpenBSD.
I'm not familiar enough with the x86_64 architecture to give the details, but essentially what happens is that the CPU is switched between 64-bit mode and compatibility (32-bit) mode as part of the context switch between the kernel and a userspace program. This is pretty much the same thing that'd be done to run a 32-bit program under a 64-bit kernel, just happening in reverse.
BTW, OS X does not use the ELF binary format, it uses Mach-O binaries. The Mach-O format allows multiarchitecture ("universal") binaries, so programs (and for that matter the kernel) can be supplied in both 32- and 64-bit (and PPC and PPC64 and...), and the OS can choose which version to load (and hence which mode to run it in) at load time. You can use the file command on a binary to see what format(s) it's in. For example, here's the Chess application shipped with OS X v10.5:
$ file Applications/Chess.app/Contents/MacOS/Chess
Applications/Chess.app/Contents/MacOS/Chess: Mach-O universal binary with 4 architectures
Applications/Chess.app/Contents/MacOS/Chess (for architecture ppc): Mach-O executable ppc
Applications/Chess.app/Contents/MacOS/Chess (for architecture ppc64): Mach-O 64-bit executable ppc64
Applications/Chess.app/Contents/MacOS/Chess (for architecture i386): Mach-O executable i386
Applications/Chess.app/Contents/MacOS/Chess (for architecture x86_64): Mach-O 64-bit executable x86_64
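For the curious, the architecture list that file prints comes from a small table at the start of a universal binary. The following is a minimal Python sketch of reading that table, not how file or the OS X loader is actually implemented. The function names list_fat_archs and make_sample_header are made up for this example, and the header it parses is synthetic (no real code slices), built only so the sketch runs anywhere.

```python
import struct

FAT_MAGIC = 0xCAFEBABE       # big-endian magic number of a universal ("fat") file
CPU_ARCH_ABI64 = 0x01000000  # flag OR-ed into cputype for 64-bit ABIs

# A few cputype values; anything else is reported as unknown.
CPU_NAMES = {
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
}

def list_fat_archs(data):
    """Return the architecture names listed in a fat Mach-O header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal Mach-O header")
    names = []
    for i in range(count):
        # Each 20-byte record holds: cputype, cpusubtype, offset, size, align.
        cputype, _sub, _off, _size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i)
        names.append(CPU_NAMES.get(cputype, "unknown(%#x)" % cputype))
    return names

def make_sample_header(cputypes):
    """Build a synthetic header (no real slices), purely for demonstration."""
    header = struct.pack(">II", FAT_MAGIC, len(cputypes))
    for cputype in cputypes:
        header += struct.pack(">5I", cputype, 0, 0, 0, 0)
    return header

sample = make_sample_header([18, 18 | CPU_ARCH_ABI64, 7, 7 | CPU_ARCH_ABI64])
print(list_fat_archs(sample))  # ['ppc', 'ppc64', 'i386', 'x86_64']
```

To try it on a real file, pass the first few kilobytes of the binary, e.g. list_fat_archs(open(path, "rb").read(4096)). A single-architecture ("thin") Mach-O file has a different magic number and would raise ValueError here.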
And a note for those doubting that this is possible: OS X supported 64-bit programs starting in v10.4 (with limited API support), but didn't include a 64-bit kernel until v10.6 (and even then, the kernel ran in 32-bit mode by default on most models). See Apple's 64-bit transition guide for details. I'm posting this from a MacBook Pro running 10.6 with a 32-bit kernel (64-bit isn't supported for this particular model), but according to Activity Monitor the only process not running in 64-bit mode is kernel_task.
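If you'd rather check this from a script than from Activity Monitor, a rough Python sketch follows. It compares the pointer width of the running process with the machine name reported by the OS. The function names are made up for this example, and the assumption that the reported machine name reflects the kernel's architecture is only that, an assumption: what platform.machine() returns varies by OS and version.

```python
import platform
import struct

def process_pointer_bits():
    """Pointer width, in bits, of the process running this script."""
    return struct.calcsize("P") * 8

def reported_machine():
    """Machine name as reported by the OS (assumed to describe the kernel)."""
    return platform.machine()

print("this process uses %d-bit pointers" % process_pointer_bits())
print("the OS reports machine type: %r" % reported_machine())
```

On a setup like the one described above, you would expect the first line to say 64 when the script runs under a 64-bit Python build, regardless of which kernel is loaded.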
Macs support running 64-bit apps on top of a 32-bit kernel because of a multi-stage plan to do exactly that:
- Mac apps ship as "fat binaries" in "bundles" which allow all four combos of 64/32-bit and Intel/PPC to be part of a single install, which can be as simple as one drag-and-drop. The OS runs the appropriate one.
- Macs use PAE to access more than 4GB of RAM when running a 32-bit kernel. Windows does not allow using PAE to address more than 4GB on non-Server versions because of compatibility issues with drivers, of which it has a lot more, including third-party ones.
- Tiger adds a 64-bit ABI (Application Binary Interface) to run 64-bit code on top of the 32-bit kernel, and a 64-bit version of the low-level APIs (Application Programming Interface) for "console" (not GUI) apps.
- Leopard adds 64-bit Cocoa for GUI apps (but not 64-bit Carbon).
- Snow Leopard adds a 64-bit kernel, which is the default on only certain high-end models.
- Lion requires a 64-bit CPU, but still includes the 32-bit kernel. An old Mac with a 64-bit CPU but a GPU that only has 32-bit drivers would have to run the 32-bit kernel, for example.
So OS X supported 64-bit apps as soon as possible, and continued to run the 32-bit kernel as long as possible because of the driver situation. (The bit-ness of the kernel only becomes a factor when trying to manage huge amounts of RAM -- page tables take memory too -- and switching to a 64-bit kernel offers some performance benefits.) But Apple is certainly not shy about dropping stuff.
The real question then is why Windows and Linux did not do the same thing. For Windows, consider that their first attempt at Win64 was with Itanium, which was completely different. But the final answer might boil down to what it usually has been for the last few decades: compatibility with a bunch of third-party programs that didn't do things quite the right way:
OS X’s 64-bit implementation differs significantly from that of Windows, which treats its 32-bit and 64-bit versions as two distinct operating systems stored on different install media. This is done mostly to maintain Windows’ compatibility with older applications – moving or renaming things like the System32 folder would break programs that expected it to be there – and as a result the two are separated to the point that there isn’t even an upgrade path between 32-bit Windows and 64-bit Windows. Because of this, and because Windows applications and drivers usually have distinct 32-bit and 64-bit versions, Windows’ transition to 64-bit has been slightly rockier and slightly more visible to the user.
There's lots of background info on the 64-bit transition on both the Mac side and the Windows side. (Those links are to the last of each series of articles; be sure to go back to the beginning of each.)
I don't know what the story was with Linux, but I imagine Linus had a strong opinion about it.