Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!

Let's look at a classic example:

#include <iostream>

int main()
{
    char* p = "hello!\n";   // yes I know, deprecated conversion
    p[0] = 'y';
    p[5] = 'w';
    std::cout << p;
}

The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:

The effect of attempting to modify a string literal is undefined.

I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.

Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.

Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.

Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.

Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.

Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior.end note ]

Specifically, section 1.3.24 states:

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.


Well, this is basically a straight copy-paste from the standard

3.4.1 1 implementation-defined behavior unspecified behavior where each implementation documents how the choice is made

2 EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.

3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

2 NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

3 EXAMPLE An example of undefined behavior is the behavior on integer overflow.

3.4.4 1 unspecified behavior use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

2 EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.


Maybe easy wording could be easier for understanding than the rigorous definition of the standards.

implementation-defined behavior
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.

undefined behavior
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.

unspecified behavior
Which function of these two is executed first?

void fun(int n, int m);

int fun1() {
    std::cout << "fun1";
    return 1;
}
int fun2() {
    std::cout << "fun2";
    return 2;
}
...
fun(fun1(), fun2()); // which one is executed first?

The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.


@eSKay I think your question is worth editing the answer to clarify more :)

for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?

The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left. It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:

It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.

IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.