basic assembly not working on Mac (x86_64+Lion)?

Solution 1:

First, you are using the old 32-bit Linux kernel calling convention on Mac OS X. This does not work at all.

Second, syscalls in Mac OS X are structured in a different way - they all have a leading class identifier and a syscall number. The class can be Mach, BSD or something else (see here in the XNU source) and is shifted 24 bits to the left. Normal BSD syscalls have class 2 and thus begin from 0x2000000. Syscalls in class 0 are invalid.
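
To make that layout concrete, here is a minimal C sketch that builds an identifier from a class and a number. The names xnu_syscall_id, CLASS_SHIFT, CLASS_BSD and NUMBER_MASK are invented for this illustration and are not XNU identifiers. The 24-bit shift and class 2 come from the paragraph above; the mask assumes the syscall number occupies the low 24 bits.

```c
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
enum { CLASS_SHIFT = 24, CLASS_BSD = 2 };
#define NUMBER_MASK 0x00FFFFFFu /* assumption: number lives in the low 24 bits */

/* Put the class above bit 23 and the syscall number below it. */
static uint32_t xnu_syscall_id(uint32_t syscall_class, uint32_t number)
{
    return (syscall_class << CLASS_SHIFT) | (number & NUMBER_MASK);
}
```

Under these assumptions, xnu_syscall_id(CLASS_BSD, 1) evaluates to 0x2000001, which is the value loaded into %eax for exit in the code further down.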

As per §A.2.1 of the SysV AMD64 ABI, which Mac OS X also follows, the syscall id (together with its class on XNU!) goes in %rax (or in %eax, as the high 32 bits are unused on XNU). The first argument goes in %rdi, the next in %rsi, and so on. %rcx is used by the kernel and its value is destroyed, which is why all functions in libc.dylib save it into %r10 before making syscalls (similarly to the kernel_trap macro from syscall_sw.h).
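
The register assignment can be summarized as a small lookup table. This is only a sketch: syscall_arg_regs and syscall_arg_register are names made up here, and the order follows the kernel interface described in §A.2.1 of the ABI, where %r10 takes the place that %rcx has in the ordinary function calling convention.

```c
#include <stddef.h>

/* Registers carrying syscall arguments 1..6 on x86-64.
 * The fourth slot is %r10 rather than %rcx, because the
 * syscall instruction overwrites %rcx. */
static const char *const syscall_arg_regs[] = {
    "%rdi", "%rsi", "%rdx", "%r10", "%r8", "%r9"
};

/* Return the register for a 1-based argument position,
 * or NULL when the position is out of range. */
static const char *syscall_arg_register(size_t position)
{
    size_t count = sizeof syscall_arg_regs / sizeof syscall_arg_regs[0];
    if (position < 1 || position > count)
        return NULL;
    return syscall_arg_regs[position - 1];
}
```

For the exit call used in this answer, only position 1 (%rdi, the exit status) matters.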

Third, code sections in Mach-O binaries are called __text, not .text as in Linux ELF, and they reside in the __TEXT segment; the pair is referred to as (__TEXT,__text). (nasm automatically translates .text as appropriate if Mach-O is selected as the target object type.) See the Mac OS X ABI Mach-O File Format Reference. Even if you get the assembly instructions right, putting them in the wrong segment/section leads to a bus error. You can use the .section __TEXT,__text directive (see here for directive syntax), use the simpler .text directive, or drop it altogether, since it is assumed if no -n option is supplied to as (see the manpage of as).

Fourth, the default entry point for the Mach-O ld is called start (although, as you've already figured out, it can be changed via the -e linker option).

Given all the above you should modify your assembler source to read as follows:

# You could also add one of the following directives for completeness
# .text
# or
# .section __TEXT,__text

.globl start
start:
    movl $0x2000001, %eax
    movl $32, %edi
    syscall

Here it is, working as expected:

$ as -o exit.o exit.s; ld -o exit exit.o
$ ./exit; echo $?
32

Solution 2:

Adding more explanation of the magic number. I made the same mistake of using Linux syscall numbers in my NASM code.

From the xnu kernel sources, in osfmk/mach/i386/syscall_sw.h (search for SYSCALL_CLASS_SHIFT):

/*
 * Syscall classes for 64-bit system call entry.
 * For 64-bit users, the 32-bit syscall number is partitioned
 * with the high-order bits representing the class and low-order
 * bits being the syscall number within that class.
 * The high-order 32-bits of the 64-bit syscall number are unused.
 * All system classes enter the kernel via the syscall instruction.
 */

Syscalls are partitioned:

#define SYSCALL_CLASS_NONE  0   /* Invalid */
#define SYSCALL_CLASS_MACH  1   /* Mach */  
#define SYSCALL_CLASS_UNIX  2   /* Unix/BSD */
#define SYSCALL_CLASS_MDEP  3   /* Machine-dependent */
#define SYSCALL_CLASS_DIAG  4   /* Diagnostics */

As we can see, the class for BSD system calls is 2, so the magic number 0x2000000 is constructed as:

// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
            ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
             (SYSCALL_NUMBER_MASK & (syscall_number)))
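
Going the other way, a full identifier can be split back into its class and number. The sketch below is not XNU code: the SKETCH_ constants and the three helper functions are hypothetical, the shift of 24 is taken from the "2 << 24" comment above, and the mask assumes the number sits in the low 24 bits. The class labels are copied from the comments in the class list quoted earlier.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch; see the assumptions above. */
enum { SKETCH_CLASS_SHIFT = 24 };
#define SKETCH_NUMBER_MASK 0x00FFFFFFu

/* Extract the class from the high-order bits of a 32-bit id. */
static uint32_t syscall_class_of(uint32_t id)
{
    return id >> SKETCH_CLASS_SHIFT;
}

/* Extract the syscall number from the low-order bits. */
static uint32_t syscall_number_of(uint32_t id)
{
    return id & SKETCH_NUMBER_MASK;
}

/* Map a class value to the label used in the quoted header comments. */
static const char *syscall_class_name(uint32_t syscall_class)
{
    switch (syscall_class) {
    case 0:  return "Invalid";
    case 1:  return "Mach";
    case 2:  return "Unix/BSD";
    case 3:  return "Machine-dependent";
    case 4:  return "Diagnostics";
    default: return "Unknown";
    }
}
```

With these helpers, 0x2000001 splits into class 2 ("Unix/BSD") and number 1, which is the exit call from the first answer.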

As for why ordinary syscalls carry the BSD class: this is probably historical. XNU combines a Mach core with a BSD layer, and POSIX-style calls such as exit are implemented in the BSD layer.

Inspired by the original answer.