Why [object doSomething] and not [*object doSomething]?

In Objective-C, why is it [object doSomething]? Shouldn't it be [*object doSomething], since you're calling a method on the object itself, which means you should dereference the pointer?


Solution 1:

The answer harkens back to the C roots of Objective-C. Objective-C was originally written as a compiler pre-processor for C. That is, Objective-C wasn't compiled so much as it was transformed into straight C and then compiled.

Start with the definition of the type id. It is declared as:

typedef struct objc_object {
    Class isa;
} *id;

That is, an id is a pointer to a structure whose first field is of type Class (which, itself, is a pointer to a structure that defines a class). Now, consider NSObject:

@interface NSObject <NSObject> {
    Class   isa;
}

Note that the layout of NSObject and the layout of the type pointed to by id are identical. That is because, in reality, a reference to an Objective-C object is really just a pointer to a structure whose first field -- always a pointer -- points to the class that contains the methods for that instance (along with some other metadata).

When you subclass NSObject and add some instance variables, you are, for all intents and purposes, simply creating a new C structure that contains your instance variables as slots appended to the slots for the instance variables of all superclasses. (The modern runtime works slightly differently so that a superclass can have ivars added without requiring all subclasses to be recompiled.)
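To make the layout idea concrete, here is a rough sketch in plain C. Every name in it (sketch_class, sketch_root, sketch_point, sketch_id) is made up for illustration and is not a real runtime declaration. The real Class structure holds far more than a name.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Hypothetical stand-ins; these are not the real runtime declarations. */
struct sketch_class {
    const char *name;             /* the real Class holds far more */
};

/* Root layout: only the isa pointer, like the NSObject excerpt above. */
struct sketch_root {
    const struct sketch_class *isa;
};

/* "Subclass" adding two ivars: the superclass slots come first. */
struct sketch_point {
    struct sketch_root super;     /* inherited slots (isa) */
    double x;                     /* new ivar */
    double y;                     /* new ivar */
};

/* Generic object reference, in the spirit of id. */
typedef struct sketch_root *sketch_id;

static_assert(offsetof(struct sketch_point, super) == 0,
              "superclass slots sit at the start of the subclass layout");

/* Works for any "instance" because isa is always the first field. */
const char *sketch_class_name(sketch_id obj) {
    return obj->isa->name;
}
```

Because the superclass slots sit at offset zero, a pointer to a sketch_point can be passed wherever a sketch_id is expected, and code that only knows about isa still works.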

Now, consider the difference between these two variables:

NSRect foo;
NSRect *bar;

(NSRect being a simple C structure -- no ObjC involved). foo is created with the storage on the stack. It will not survive once the stack frame is closed, but you also don't have to free any memory. bar is a reference to an NSRect structure that was, most likely, created on the heap using malloc().
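The same distinction can be sketched in plain C. DemoRect is a hypothetical stand-in for NSRect (the real one nests an origin and a size), and the function names are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for NSRect; the real struct nests origin and size. */
typedef struct {
    double x, y, width, height;
} DemoRect;

/* Takes the rectangle by address, so both storage styles can use it. */
double demo_area(const DemoRect *r) {
    assert(r != NULL);
    return r->width * r->height;
}

double demo_stack_rect(void) {
    DemoRect foo = { 0.0, 0.0, 3.0, 4.0 };  /* storage lives in this frame */
    return demo_area(&foo);                 /* address taken explicitly */
}   /* foo is gone here; nothing to free */

double demo_heap_rect(void) {
    DemoRect *bar = malloc(sizeof *bar);    /* storage lives on the heap */
    if (bar == NULL) {
        return -1.0;                        /* allocation failed */
    }
    *bar = (DemoRect){ 0.0, 0.0, 3.0, 4.0 }; /* dereference to fill it in */
    double area = demo_area(bar);           /* already an address */
    free(bar);                              /* the caller owns the memory */
    return area;
}
```

With foo you hold the structure itself and take its address when needed. With bar you only ever hold the address.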

If you try to say:

NSArray foo;
NSArray *bar;

The compiler will complain about the first, saying something along the lines of stack-based objects not being allowed in Objective-C. In other words, all Objective-C objects must be allocated from the heap (more or less -- there are one or two exceptions, but they are comparatively esoteric to this discussion) and, as a result, you always refer to an object through the address of said object on the heap; you are always working with pointers to objects (and the id type really is just a pointer to any old object).

Getting back to the C preprocessor roots of the language, you can translate every method call to an equivalent line of C. For example, the following two lines of code are equivalent:

[myArray objectAtIndex: 42];
objc_msgSend(myArray, @selector(objectAtIndex:), 42);

Similarly, a method declared like this:

- (id) objectAtIndex: (NSUInteger) a;

Is equivalent to a C function declared like this:

id object_at_index(id self, SEL _cmd, NSUInteger a);

And, looking at objc_msgSend(), the first argument is declared to be of type id:

OBJC_EXPORT id objc_msgSend(id self, SEL op, ...);

And that is exactly why you don't use *foo as the target of a method call. Work the translation through the forms above: the call to [myArray objectAtIndex: 42] becomes the objc_msgSend() call, which must in turn call something with the equivalent C function declaration (all dressed up in method syntax). Both take the object pointer itself (an id), not the dereferenced structure.

The object reference is carried through for two reasons: it gives the messenger, objc_msgSend(), access to the class so that it can find the method implementation, and that same reference then becomes the first parameter, the self, of the method that is eventually executed.
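As a rough illustration of that flow, here is a toy messenger in plain C. It is only a sketch under heavy assumptions: every toy_ name is invented, selectors are plain C strings, methods take exactly one long argument, and lookup is a linear search. The real objc_msgSend() is variadic, uses caching, and walks superclasses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names are hypothetical; this is not the real runtime API. */
typedef struct toy_object *toy_id;
typedef const char *toy_sel;                 /* selector as a C string */
typedef long (*toy_imp)(toy_id self, toy_sel _cmd, long arg);

struct toy_method {
    toy_sel name;
    toy_imp imp;
};

struct toy_class {
    const struct toy_method *methods;
    size_t method_count;
};

struct toy_object {
    const struct toy_class *isa;             /* first field, as in objc_object */
};

/* Simplified messenger: find the implementation through self->isa, then
   call it with the same reference as the first argument. */
long toy_msg_send(toy_id self, toy_sel op, long arg) {
    if (self == NULL) {
        return 0;                            /* a message to nil does nothing */
    }
    const struct toy_class *cls = self->isa;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].name, op) == 0) {
            return cls->methods[i].imp(self, op, arg);
        }
    }
    assert(!"unrecognized selector");
    return 0;
}

/* A "class" whose instances hold one ivar after the inherited isa. */
struct toy_counter {
    struct toy_object base;
    long total;
};

/* Method body: the object reference arrives as self. */
static long counter_add(toy_id self, toy_sel _cmd, long arg) {
    (void)_cmd;                              /* unused in this sketch */
    struct toy_counter *me = (struct toy_counter *)self;
    me->total += arg;
    return me->total;
}

static const struct toy_method counter_methods[] = {
    { "add:", counter_add },
};

const struct toy_class toy_counter_class = { counter_methods, 1 };
```

Note that toy_msg_send() never needs the structure by value. It reads isa through the pointer and then hands that same pointer on as self, which is why the receiver in the bracket syntax is the pointer rather than *pointer.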

If you really want to go deep, start here. But don't bother until you have fully grokked this.

Solution 2:

You shouldn't really think of these as pointers-to-objects. It's sort of a historical implementation detail that they are pointers, and that you use them like that in message sending syntax (see @bbum's answer). In fact, they are just "object identifiers" (or references). Let's rewind a little bit to see the conceptual rationale.

Objective-C was first proposed and discussed in this book: Object-Oriented Programming: An Evolutionary Approach. It's not immensely practical for modern Cocoa programmers, but the motivations for the language are in there.

Note that in the book all objects are given type id. You don't see the more specific Object *s in the book at all; those are just a leak in the abstraction when we're talking about the "why." Here's what the book says:

Object identifiers must uniquely identify as many objects as may ever coexist in the system at any one time. They are stored in local variables, passed as arguments in message expressions and in function calls, held in instance variables (fields inside objects), and in other kinds of memory structures. In other words, they can be used as fluidly as the built-in types of the base language.

How an object identifier actually identifies the object is an implementation detail for which many choices are plausible. A reasonable choice, certainly one of the simplest, and the one that is used in Objective-C, is to use the physical address of the object in memory as its identifier. Objective-C makes this decision known to C by generating a typedef statement into each file. This defines a new type, id, in terms of another type that C understands already, namely pointers to structures. [...]

An id consumes a fixed amount of space. [...] This space is not the same as the space occupied by the private data in the object itself.

(pp. 58-59, 2nd ed.)

So the answer to your question is twofold:

  1. The language design specifies that the identifier of an object is not the same as an object itself, and the identifier is the thing that you send messages to, not the object itself.
  2. The design doesn't dictate, but suggests, the implementation that we have now, where pointers to objects are used as identifiers.

The statically typed syntax, where you say "an object specifically of type NSString" and thus write NSString *, is a more modern change; it is basically an implementation choice and is equivalent to id at runtime.

If this seems like a high-minded response to a question about pointer dereferencing, it's important to keep in mind that objects in Objective-C are "special" per the definition of the language. They are implemented as structures and passed around as pointers to structures, but they are conceptually different.
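The identifier-versus-object distinction can be sketched in plain C as well. The names below (ident_object, ident_id, ident_same_object) are hypothetical, and the 256-byte payload is just an arbitrary stand-in for an object's private data.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical object with some private data; not a real runtime type. */
struct ident_object {
    const void *isa;
    char payload[256];
};

/* The identifier is simply the object's address. */
typedef struct ident_object *ident_id;

/* The identifier has a fixed size, unrelated to the data it identifies. */
static_assert(sizeof(ident_id) < sizeof(struct ident_object),
              "an identifier is smaller than this object's data");

/* Two identifiers name the same object exactly when the addresses match. */
int ident_same_object(ident_id a, ident_id b) {
    return a == b;
}
```

Assigning one ident_id to another copies only the address, so both variables then identify the same object and a change made through one is visible through the other.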

Solution 3:

Because objc_msgSend() is declared like this:

id objc_msgSend(id theReceiver, SEL theSelector, ...);

Solution 4:

  1. It's not a pointer, it's a reference to an object.
  2. It's not a method, it's a message.

Solution 5:

You never dereference object pointers, period. The fact that they're typed as pointers rather than just "object types" is an artifact of the language's C heritage. It's exactly equivalent to Java's type system, where objects are always accessed through references. You never dereference an object in Java — in fact, you can't. You should not think of them as pointers, because semantically, they aren't. They're just object references.