Is ‘int main;’ a valid C/C++ program?

I ask because my compiler seems to think so, even though I don’t.

echo 'int main;' | cc -x c - -Wall
echo 'int main;' | c++ -x c++ - -Wall

Clang issues no warning or error with this, and gcc issues only the meek warning: 'main' is usually a function [-Wmain], but only when compiled as C. Specifying a -std= doesn’t seem to matter.

Otherwise, it compiles and links fine. But on execution, it terminates immediately with SIGBUS (for me).

After reading through the (excellent) answers at What should main() return in C and C++? and a quick grep through the language specs, I would certainly conclude that a main function is required. But the wording of gcc’s -Wmain (‘main’ is usually a function), together with the dearth of errors here, suggests otherwise.

But why? Is there some strange edge-case or “historical” use for this? Anyone know what gives?

My point, I suppose, is that I really think this should be an error in a hosted environment, eh?


Solution 1:

Since the question is double-tagged as C and C++, the reasoning for C++ and C would be different:

  • C++ uses name mangling to help the linker distinguish between textually identical symbols of different types, e.g. a global variable xyz and a free-standing global function xyz(int). However, the name main is never mangled.
  • C does not use mangling, so it is possible for a program to confuse the linker by providing a symbol of one kind in place of a different symbol, and have the program link successfully.

That is what's going on here: the linker expects to find the symbol main, and it does. It "wires" that symbol as if it were a function, because it does not know any better. The portion of the runtime library that passes control to main asks the linker for main, so the linker gives it the symbol main, letting the link phase complete. Of course this fails at runtime, because main is not a function.

Here is another illustration of the same issue:

file x.c:

#include <stdio.h>
int foo(); // <<== main() expects this
int main(){
    printf("%p\n", (void*)&foo);
    return 0;
}

file y.c:

int foo; // <<== external definition supplies a symbol of a wrong kind

compiling:

gcc x.c y.c

This compiles, and it would probably run, but it's undefined behavior, because the type of the symbol promised to the compiler is different from the actual symbol supplied to the linker.

As far as the warning goes, I think it is reasonable: C lets you build libraries that have no main function, so the compiler frees up the name main for other uses if you need to define a variable main for some unknown reason.

Solution 2:

main isn't a reserved word; it's just a predefined identifier (like cin, endl, npos...), so you could declare a variable called main, initialize it and then print out its value.

Of course:

  • the warning is useful since this is quite error prone;
  • you can have a source file without the main() function (libraries).

EDIT

Some references:

  • main is not a reserved word (C++11):

    The function main shall not be used within a program. The linkage (3.5) of main is implementation-defined. A program that defines main as deleted or that declares main to be inline, static, or constexpr is ill-formed. The name main is not otherwise reserved. [ Example: member functions, classes and enumerations can be called main, as can entities in other namespaces. — end example ]

    C++11 - [basic.start.main] 3.6.1/3

    [2.11/3] [...] some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.

    [17.6.4.3.2/1] Certain sets of names and function signatures are always reserved to the implementation:

    • Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
    • Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
  • Reserved words in programming languages.

    Reserved words may not be redefined by the programmer, but predefineds can often be overridden in some capacity. This is the case of main: there are scopes in which a declaration using that identifier redefines its meaning.

Solution 3:

Is int main; a valid C/C++ program?

It is not entirely clear what a C/C++ program is.

Is int main; a valid C program?

Yes. A freestanding implementation is allowed to accept such a program. main doesn't have to have any special meaning in a freestanding environment.

It is not valid in a hosted environment.

Is int main; a valid C++ program?

Ditto.

Why does it crash?

The program doesn't have to make sense in your environment. In a freestanding environment the program startup and termination, and the meaning of main, are implementation-defined.

Why does the compiler warn me?

The compiler may warn you about whatever it pleases, as long as it doesn't reject conforming programs. On the other hand, a warning is all that's required to diagnose a non-conforming program. Since this translation unit cannot be a part of a valid hosted program, a diagnostic message is justified.

Is gcc a freestanding environment, or is it a hosted environment?

Yes.

gcc documents the -ffreestanding compilation flag. Add it, and the warning goes away. You may want to use it when building e.g. kernels or firmware.

g++ doesn't document such a flag. Supplying it seems to have no effect on this program. It is probably safe to assume that the environment provided by g++ is hosted. The absence of a diagnostic in this case is a bug.

Solution 4:

It is a warning because it is not technically disallowed. The startup code will use the symbol location of "main" and jump to it with the three standard arguments (argc, argv and envp). It does not, and at link time cannot, check that it is actually a function, nor even that it takes those arguments. This is also why int main(int argc, char **argv) works: the compiler doesn't know about the envp argument, it just happens not to be used, and the calling convention is caller-cleanup.

As a joke, you could do something like

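int main = 0xC3C3C3C3;

on an x86 machine (0xC3 is the opcode of the near ret instruction; 0xCB would be a far return) and, ignoring warnings and similar stuff, it will not just compile but actually work too, provided the memory holding main is executable, which is often not the case for data on modern systems.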

Somebody used a technique similar to this to write an executable (sort of) that runs on multiple architectures directly - http://phrack.org/issues/57/17.html#article . It was also used to win the IOCCC - http://www.ioccc.org/1984/mullender/mullender.c .