What's the difference between a string constant and a string literal?
I'm learning Objective-C and Cocoa and have come across this statement:
The Cocoa frameworks expect that global string constants rather than string literals are used for dictionary keys, notification and exception names, and some method parameters that take strings.
I've only worked in higher-level languages, so I have never had to consider the details of strings that much. What's the difference between a string constant and a string literal?
Solution 1:
In Objective-C, the syntax @"foo" is an immutable, literal instance of NSString. It does not make a constant string from a string literal, as Mike assumes.
Objective-C compilers typically do intern literal strings within compilation units — that is, they coalesce multiple uses of the same literal string — and it's possible for the linker to do additional interning across the compilation units that are directly linked into a single binary. (Since Cocoa distinguishes between mutable and immutable strings, and literal strings are always also immutable, this can be straightforward and safe.)
Constant strings on the other hand are typically declared and defined using syntax like this:
// MyExample.h - declaration, other code references this
extern NSString * const MyExampleNotification;
// MyExample.m - definition, compiled for other code to reference
NSString * const MyExampleNotification = @"MyExampleNotification";
The point of the syntactic exercise here is that you can make uses of the string efficient by ensuring that there's only one instance of that string in use even across multiple frameworks (shared libraries) in the same address space. (The placement of the const keyword matters; it guarantees that the pointer itself is constant.)
While burning memory isn't as big a deal as it may have been in the days of 25MHz 68030 workstations with 8MB of RAM, comparing strings for equality can take time. Ensuring that most of the time strings that are equal will also be pointer-equal helps.
Say, for example, you want to subscribe to notifications from an object by name. If you use non-constant strings for the names, the NSNotificationCenter posting the notification could wind up doing a lot of byte-by-byte string comparisons when determining who is interested in it. If most of these comparisons are short-circuited because the strings being compared have the same pointer, that can be a big win.
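The short-circuit described above can be sketched in plain C, which Objective-C is a superset of. This is only an illustration of the idea, not how NSNotificationCenter is actually implemented; the function names_equal and the C-string version of MyExampleNotification are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical notification name, defined once and shared by reference. */
const char *const MyExampleNotification = "MyExampleNotification";

/* Returns true when two names match, checking pointer identity first. */
bool names_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    if (a == b) {
        return true;           /* same pointer: no byte comparison needed */
    }
    return strcmp(a, b) == 0;  /* different pointers: compare the contents */
}
```

When every caller passes the shared constant, the first branch is taken and the cost no longer depends on the length of the name. A caller holding a separate copy of the same text still gets the right answer, just through the slower path.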
Solution 2:
Some definitions
A literal is a value, which is immutable by definition, e.g. 10
A constant is a read-only variable or pointer, e.g. const int age = 10;
A string literal is an expression like @"". The compiler will replace this with an instance of NSString.
A string constant is a read-only pointer to NSString, e.g. NSString *const name = @"John";
Some comments on the last line:
- That's a constant pointer, not a constant object [1]. objc_msgSend [2] doesn't care if you qualify the object with const. If you want an immutable object, you have to code that immutability inside the object [3].
- All @"" expressions are indeed immutable. They are replaced [4] at compile time with instances of NSConstantString, which is a specialized subclass of NSString with a fixed memory layout [5]. This also explains why NSString is the only object that can be initialized at compile time [6].
A constant string would be const NSString* name = @"John"; which is equivalent to NSString const* name = @"John";. Here, both the syntax and the programmer's intention are wrong: const <object> is ignored, and the NSString instance (NSConstantString) was already immutable.
[1] The keyword const applies to whatever is immediately to its left. If there is nothing to its left, it applies to whatever is immediately to its right.
[2] This is the function that the runtime uses to send all messages in Objective-C, and therefore what you can use to change the state of an object.
[3] Example: in const NSMutableArray *array = [NSMutableArray new]; [array removeAllObjects]; the const doesn't prevent the last statement.
[4] The LLVM code that rewrites the expression is RewriteModernObjC::RewriteObjCStringLiteral in RewriteModernObjC.cpp.
[5] To see the NSConstantString definition, cmd+click it in Xcode.
[6] Creating compile-time constants for other classes would be easy, but it would require the compiler to use a specialized subclass. This would break compatibility with older Objective-C versions.
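The placement rule for const can be tried out in plain C, which shares it with Objective-C. The sketch below uses a hypothetical struct counter in place of a mutable object and a hypothetical counter_reset function in place of a mutating message. One difference to keep in mind: the C compiler does reject direct writes through a pointer-to-const, whereas the answer above describes message sends as ignoring that qualifier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutable object. */
struct counter { int value; };

/* Mutates through a plain pointer, loosely like a mutating message. */
void counter_reset(struct counter *c) {
    assert(c != NULL);
    c->value = 0;
}

/* const to the right of '*': the pointer is fixed, the object is not. */
int demo_const_pointer(void) {
    struct counter a = { 5 };
    struct counter *const fixed = &a;
    /* fixed = NULL;  would not compile: the pointer itself is const */
    counter_reset(fixed);  /* allowed: the pointed-to object can change */
    return a.value;
}

/* const to the left of '*': the object is read-only through this pointer. */
int demo_pointer_to_const(void) {
    struct counter a = { 5 };
    struct counter b = { 7 };
    const struct counter *view = &a;
    /* view->value = 1;  would not compile: pointee is const via 'view' */
    view = &b;             /* allowed: the pointer can be re-pointed */
    return view->value;
}
```

The first form corresponds to NSString *const name, the form used for string constants; the second corresponds to const NSString *name, which leaves the pointer itself reassignable.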
Back to your quote
The Cocoa frameworks expect that global string constants rather than string literals are used for dictionary keys, notification and exception names, and some method parameters that take strings. You should always prefer string constants over string literals when you have a choice. By using string constants, you enlist the help of the compiler to check your spelling and thus avoid runtime errors.
It says that literals are error-prone, but it doesn't say that they are also slower. Compare:
// string literal
[dic objectForKey:@"a"];
// string constant
NSString *const a = @"a";
[dic objectForKey:a];
In the second case I'm using keys with const pointers, so instead of [a isEqualToString:b], I can do (a==b). The implementation of isEqualToString: compares the hash and then runs the C function strcmp, so it is slower than comparing the pointers directly. This is why constant strings are better: they are faster to compare and less prone to errors.
If you also want your constant string to be global, do it like this:
// header
extern NSString *const name;
// implementation
NSString *const name = @"john";
Solution 3:
Let's use C++, since my Objective-C is totally non-existent.
If you stash a string into a constant variable:
const std::string mystring = "my string";
Now when you call methods with mystring, you're using a string constant:
someMethod(mystring);
Or, you can call those methods with the string literal directly:
someMethod("my string");
The reason, presumably, that they encourage you to use string constants is that Objective-C doesn't do "interning"; that is, when you use the same string literal in several places, each use is actually a different pointer pointing to a separate copy of the string.
For dictionary keys, this makes a huge difference, because if I can see the two pointers are pointing to the same thing, that's much cheaper than having to do a whole string comparison to make sure the strings have equal value.
Edit: Mike, in C# strings are immutable, and literal strings with identical values all end up pointing at the same string value. I imagine that's true for other languages with immutable strings as well. Ruby, which has mutable strings, offers a separate data type: symbols ("foo" vs. :foo, where the former is a mutable string, and the latter is an immutable identifier often used for Hash keys).