What does the symbol \0 mean in a string-literal?
Consider following code:
char str[] = "Hello\0";
What is the length of str array, and with how much 0s it is ending?
sizeof str
is 7 - five bytes for the "Hello" text, plus the explicit NUL terminator, plus the implicit NUL terminator.
strlen(str)
is 5 - the five "Hello" bytes only.
The key here is that the implicit nul terminator is always added - even if the string literal just happens to end with \0
. Of course, strlen
just stops at the first \0
- it can't tell the difference.
There is one exception to the implicit NUL terminator rule - if you explicitly specify the array size, the string will be truncated to fit:
char str[6] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 6 (with one NUL)
char str[7] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 7 (with two NULs)
char str[8] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 8 (with three NULs per C99 6.7.8.21)
This is, however, rarely useful, and prone to miscalculating the string length and ending up with an unterminated string. It is also forbidden in C++.
The length of the array is 7, the NUL character \0
still counts as a character and the string is still terminated with an implicit \0
See this link to see a working example
Note that had you declared str
as char str[6]= "Hello\0";
the length would be 6 because the implicit NUL is only added if it can fit (which it can't in this example.)
§ 6.7.8/p14
An array of character type may be initialized by a character string literal, optionally enclosed in braces. Sucessive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
Examples
char str[] = "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[5]= "Hello\0"; /* sizeof == 5, str is "Hello" with no NUL (no longer a C-string, just an array of char). This may trigger compiler warning */
char str[6]= "Hello\0"; /* sizeof == 6, Explicit NUL only */
char str[7]= "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[8]= "Hello\0"; /* sizeof == 8, Explicit + two Implicit NUL */
Specifically, I want to mention one situation, by which you may confuse.
What is the difference between "\0" and ""?
The answer is that "\0"
represents in array is {0 0}
and ""
is {0}
.
Because "\0"
is still a string literal and it will also add "\0"
at the end of it. And ""
is empty but also add "\0"
.
Understanding of this will help you understand "\0"
deeply.
Banging my usual drum solo of JUST TRY IT, here's how you can answer questions like that in the future:
$ cat junk.c
#include <stdio.h>
char* string = "Hello\0";
int main(int argv, char** argc)
{
printf("-->%s<--\n", string);
}
$ gcc -S junk.c
$ cat junk.s
... eliding the unnecessary parts ...
.LC0:
.string "Hello"
.string ""
...
.LC1:
.string "-->%s<--\n"
...
Note here how the string I used for printf is just "-->%s<---\n"
while the global string is in two parts: "Hello"
and ""
. The GNU assembler also terminates strings with an implicit NUL
character, so the fact that the first string (.LC0) is in those two parts indicates that there are two NUL
s. The string is thus 7 bytes long. Generally if you really want to know what your compiler is doing with a certain hunk of code, isolate it in a dummy example like this and see what it's doing using -S
(for GNU -- MSVC has a flag too for assembler output but I don't know it off-hand). You'll learn a lot about how your code works (or fails to work as the case may be) and you'll get an answer quickly that is 100% guaranteed to match the tools and environment you're working in.