Difference between string.empty and string[0] == '\0'

Suppose we have a string

std::string str; // some value is assigned

What is the difference between str.empty() and str[0] == '\0'?


Solution 1:

C++11 and beyond

string_variable[0] is required to return the null character if the string is empty, so there is no undefined behavior and the comparison still works when the string is truly empty. However, a string can start with an embedded null character (for example, a string whose contents are "\0Hi there"), in which case the comparison is true even though the string is not empty. If you really want to know whether it's empty, use empty().


Pre-C++11

The difference is that if the string is empty, string_variable[0] has undefined behavior unless the string is const-qualified, because there is no element at index 0. If the string is const-qualified, it returns a null character.

string_variable.empty() on the other hand returns true if the string is empty, and false if it is not; the behavior won't be undefined.


Summary

empty() is meant to check whether the string/container is empty or not. It works on all containers that provide it and using empty clearly states your intent - which means a lot to people reading your code (including you).

Solution 2:

Since C++11 it is guaranteed that str[str.size()] == '\0'. This means that if a string is empty, then str[0] == '\0'. But a C++ string has an explicit length field, meaning it can contain embedded null characters.

E.g. for std::string str("\0ab", 3), str[0] == '\0' but str.empty() is false.

Besides, str.empty() is more readable than str[0] == '\0'.

Solution 3:

Other answers here are 100% correct. I just want to add three more notes:

empty is generic (every STL container implements this function), while operator[] with a size_t index only works with string objects and array-like containers. When dealing with generic STL code, empty is preferred.

Also, empty is pretty much self-explanatory, while == '\0' is not. When it's 2 AM and you're debugging your code, would you prefer to see if (str.empty()) or if (str[0] == '\0')? If only functionality mattered, we would all write in vanilla assembly.

There may also be a performance penalty. empty is usually implemented by comparing the size member of the string to zero, which is very cheap, easy to inline, etc. Comparing against the first character may be heavier. First, most standard library implementations use the short string optimization (the standard does not require it), and some of them first have to check whether the string is in "short mode" or "long mode" to locate the characters. Branching can hurt performance. If the string is long, dereferencing the pointer may also be costly when the string has not been touched for some time, because the dereference itself may cause a cache miss.