Why do string literals (char*) in C++ have to be constants?
Expanding on Christian Gibbons' answer a bit...
In C, string literals like "Hello World" are stored in arrays of char with static storage duration, so they exist for the lifetime of the program. String literals are supposed to be immutable, and some implementations store them in a read-only memory segment (such that attempting to modify the literal's contents triggers a runtime error). Other implementations don't, and attempting to modify the literal's contents may not trigger a runtime error (it may even appear to work as intended). The C language definition leaves the behavior "undefined" so that the compiler is free to handle the situation however it sees fit.
In C++, string literals are stored in arrays of const char, so that any attempt to modify the literal's contents will trigger a diagnostic at compile time.
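As a rough illustration of the point above, the type of a literal can be checked at compile time. This is only a sketch: literal_length is a hypothetical helper introduced here for the example, not something from the standard library.

```cpp
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue of type "array of N const char",
// so decltype yields a reference to that array type.
static_assert(std::is_same<decltype("Hello"), const char (&)[6]>::value,
              "5 characters plus the terminating null");

// Hypothetical helper: reports the array length the compiler deduced.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N; }

static_assert(literal_length("Hello World") == 12,
              "11 characters plus the terminating null");

// Both of the following are rejected at compile time in C++11 and later:
// char *p = "Hello";     // drops the const qualifier
// "Hello"[0] = 'h';      // assigns to a const char
```

If the commented-out lines are enabled, a conforming C++11 compiler should refuse to compile them; the exact wording of the diagnostic varies by compiler.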
As Christian points out, the const keyword was not originally a part of C. It was, however, originally part of C++, and it makes using string literals a little safer.
Remember that the const keyword does not mean "store this in read-only memory"; it only means "this thing may not be the target of an assignment."
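A small sketch of that distinction, using a hypothetical function name (read_through_const_view) made up for this example: the object lives in ordinary writable storage, and const only forbids writes made through the const-qualified name.

```cpp
// Returns the value seen through a const reference after the underlying
// (non-const) object has been changed via its original name.
int read_through_const_view() {
    int counter = 1;
    const int &view = counter;  // const restricts writes made through 'view' only
    // view = 2;                // error: 'view' may not be the target of an assignment
    counter = 2;                // fine: the object itself is not const
    return view;                // observes the updated value
}
```

Nothing here is placed in read-only memory; the compiler simply rejects assignments through view.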
Also remember that, unless it is the operand of the sizeof or unary & operators, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
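The decay rule and its exceptions can be sketched with a few compile-time checks. The names greeting and decays_to_first_element are assumptions made up for this example.

```cpp
#include <type_traits>

// A string literal used as an initializer: its 6 chars are copied into the array.
char greeting[] = "Hello";

// sizeof sees the whole array, not a pointer.
static_assert(sizeof(greeting) == 6, "array of 6 char");

// Unary & yields a pointer to the whole array, not to its first element.
static_assert(std::is_same<decltype(&greeting), char (*)[6]>::value,
              "pointer to array of 6 char");

// In an ordinary expression (here, the operand of unary +) the array decays.
static_assert(std::is_same<decltype(+greeting), char *>::value,
              "pointer to char");

// The decayed pointer holds the address of the first element.
bool decays_to_first_element() {
    char *p = greeting;
    return p == &greeting[0];
}
```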
In C++, when you write
const char *str = "Hello, world";
the address of the first character of the string is stored to str. You can set str to point to a different string literal:
str = "Goodbye cruel world";
but what you cannot do is modify the contents of the string, something like
str[0] = 'h';
or
strcpy( str, "Something else" );
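If modifiable text is what you need, one option is to work on a copy rather than on the literal itself. The following is a minimal sketch; the function names are hypothetical, and std::string is an addition not mentioned in the answers above.

```cpp
#include <string>

// Copies the literal into a local array, then edits the copy.
std::string lowercase_first_via_array() {
    char buf[] = "Hello, world";  // writable array initialized from the literal
    buf[0] = 'h';                 // fine: buf is not const
    return std::string(buf);
}

// The same idea with std::string, which owns its own writable buffer.
std::string replace_via_string() {
    std::string s = "Hello, world";
    s = "Something else";         // replaces the contents, resizing as needed
    return s;
}
```

In both cases the literal itself is never written to; only the copy changes.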
C didn't initially have the const keyword, so requiring const-qualification for string literals after the keyword was introduced would have broken legacy code. C's string literals are nevertheless immutable, so changing their contents is undefined behavior even though their type is not const-qualified.
C++, on the other hand, was designed with the const keyword. Initially, C++ did allow string literals to be assigned to non-const-qualified char * pointers, presumably for compatibility with existing C code. That conversion was deprecated in C++98 and C++03, and the C++11 standard removed it rather than allowing the dissonance to continue in perpetuity. I would guess the amount of legacy C++ code relying on non-const-qualified char * pointers to string literals was small enough that it was a worthy trade-off.