C++ static member variable and its initialization

Solution 1:

Fundamentally this is because static members must be defined in exactly one translation unit, in order to not violate the One-Definition Rule. If the language were to allow something like:

struct Gizmo
{
  static string name = "Foo";
};

then name would be defined in each translation unit that #includes this header file.

C++ does allow you to define integral static members within the declaration, but you still have to include a definition within a single translation unit, but this is just a shortcut, or syntactic sugar. So, this is allowed:

struct Gizmo
{
  static const int count = 42;
};

So long as a) the expression is const integral or enumeration type, b) the expression can be evaluated at compile-time, and c) there is still a definition somewhere that doesn't violate the one definition rule:

file: gizmo.cpp

#include "gizmo.h"

const int Gizmo::count;

Solution 2:

In C++ since the beginning of times the presence of an initializer was an exclusive attribute of object definition, i.e. a declaration with an initializer is always a definition (almost always).

As you must know, each external object used in C++ program has to be defined once and only once in only one translation unit. Allowing in-class initializers for static objects would immediately go against this convention: the initializers would go into header files (where class definitions usually reside) and thus generate multiple definitions of the same static object (one for each translation unit that includes the header file). This is, of course, unacceptable. For this reason, the declaration approach for static class members is left perfectly "traditional": you only declare it in the header file (i.e. no initializer allowed), and then you define it in a translation unit of your choice (possibly with an initializer).

One exception from this rule was made for const static class members of integral or enum types, because such entries can for Integral Constant Expressions (ICEs). The main idea of ICEs is that they are evaluated at compile time and thus do no depend on definitions of the objects involved. Which is why this exception was possible for integral or enum types. But for other types it would just contradict the basic declaration/definition principles of C++.

Solution 3:

It's because of the way the code is compiled. If you were to initialize it in the class, which often is in the header, every time the header is included you'd get an instance of the static variable. This is definitely not the intent. Having it initialized outside the class gives you the possibility to initialize it in the cpp file.

Solution 4:

Section 9.4.2, Static data members, of the C++ standard states:

If a static data member is of const integral or const enumeration type, its declaration in the class definition can specify a const-initializer which shall be an integral constant expression.

Therefore, it is possible for the value of a static data member to be included "within the class" (by which I presume that you mean within the declaration of the class). However, the type of the static data member must be a const integral or const enumeration type. The reason why the values of static data members of other types cannot be specified within the class declaration is that non-trivial initialization is likely required (that is, a constructor needs to run).

Imagine if the following were legal:

// my_class.hpp
#include <string>

class my_class
{
public:
  static std::string str = "static std::string";
//...

Each object file corresponding to CPP files that include this header would not only have a copy of the storage space for my_class::str (consisting of sizeof(std::string) bytes), but also a "ctor section" that calls the std::string constructor taking a C-string. Each copy of the storage space for my_class::str would be identified by a common label, so a linker could theoretically merge all copies of the storage space into a single one. However, a linker would not be able to isolate all copies of the constructor code within the object files' ctor sections. It would be like asking the linker to remove all of the code to initialize str in the compilation of the following:

std::map<std::string, std::string> map;
std::vector<int> vec;
std::string str = "test";
int c = 99;
my_class mc;
std::string str2 = "test2";

EDIT It is instructive to look at the assembler output of g++ for the following code:

// SO4547660.cpp
#include <string>

class my_class
{
public:
    static std::string str;
};

std::string my_class::str = "static std::string";

The assembly code can be obtained by executing:

g++ -S SO4547660.cpp

Looking through the SO4547660.s file that g++ generates, you can see that there is a lot of code for such a small source file.

__ZN8my_class3strE is the label of the storage space for my_class::str. There is also the assembly source of a __static_initialization_and_destruction_0(int, int) function, which has the label __Z41__static_initialization_and_destruction_0ii. That function is special to g++ but just know that g++ will make sure that it gets called before any non-initializer code gets executed. Notice that the implementation of this function calls __ZNSsC1EPKcRKSaIcE. This is the mangled symbol for std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&).

Going back to the hypothetical example above and using these details, each object file corresponding to a CPP file that includes my_class.hpp would have the label __ZN8my_class3strE for sizeof(std::string) bytes as well as assembly code to call __ZNSsC1EPKcRKSaIcE within its implementation of the __static_initialization_and_destruction_0(int, int) function. The linker can easily merge all occurrences of __ZN8my_class3strE, but it cannot possibly isolate the code that calls __ZNSsC1EPKcRKSaIcE within the object file's implementation of __static_initialization_and_destruction_0(int, int).