What is happening here in this C++ code?
Can anyone please explain what is going on in this C++ code? It compiles and runs fine on Linux.
#include <iostream>
using namespace std;
int main = ( cout << "Hello world!\n", 195 );
The number 195 (0xC3 in hexadecimal) is the opcode of the RET instruction on x86.
The C++ compiler (gcc in my case) is unable to recognize that "main" wasn't declared as a function. The compiler only sees that there is the "main" symbol, and presumes that it refers to a function.
The C++ code
int main = ( cout << "Hello world!\n", 195 );
is initializing a variable at file scope. The initializer uses the comma operator: the left operand (the output to cout) is evaluated first and its result is discarded, then the right operand (195) becomes the value of the whole expression. This initialization code is executed before the C/C++ environment calls main(), but after it initializes the "cout" variable. The initialization prints "Hello world!\n" and sets the value of the variable "main" to 195. After all initialization is done, the C/C++ environment makes a call to "main". The program returns immediately from this call because we put a RET instruction (code 195) at the address of "main".
Sample GDB output:
$ gdb ./a
(gdb) break _fini
Breakpoint 1 at 0x8048704
(gdb) print main
$1 = 0
(gdb) disass &main
Dump of assembler code for function main:
0x0804a0b4 <+0>: add %al,(%eax)
0x0804a0b6 <+2>: add %al,(%eax)
End of assembler dump.
(gdb) run
Starting program: /home/atom/a
Hello world!
Breakpoint 1, 0x08048704 in _fini ()
(gdb) print main
$2 = 195
(gdb) disass &main
Dump of assembler code for function main:
0x0804a0b4 <+0>: ret
0x0804a0b5 <+1>: add %al,(%eax)
0x0804a0b7 <+3>: add %al,(%eax)
End of assembler dump.
It's not a valid C++ program. In fact, it crashes for me on Mac OSX after printing "Hello World".
Disassembly shows main is a static variable, and there are initializers for it:
global constructors keyed to main:
0000000100000e20 pushq %rbp
0000000100000e21 movq %rsp,%rbp
0000000100000e24 movl $0x0000ffff,%esi
0000000100000e29 movl $0x00000001,%edi
0000000100000e2e leave
0000000100000e2f jmp __static_initialization_and_destruction_0(int, int)
Why does it print "Hello World"?
You see "Hello World" printed because the output happens during static initialization of main, the static integer variable. Static initializers run before the C++ runtime even tries to call main(). When it does, it crashes, because main isn't a valid function: there is just an integer 195 in the data section of the executable.
Other answers indicate this is a valid ret instruction and that it runs fine on Linux, but it crashes on OS X because the data section is marked as non-executable by default.
Why can't a C++ compiler tell that main() isn't a function and stop with linker error?
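In practice main() is treated as having C linkage, so its symbol name is not mangled and carries no type information. The linker therefore can't tell whether the symbol refers to a function or a variable. In our case, _main resides in the data section.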
start:
0000000100000eac pushq $0x00
0000000100000eae movq %rsp,%rbp
...
0000000100000c77 callq _main ; 1000010b0
0000000100000c7c movl %eax,%edi
0000000100000c7e callq 0x100000e16 ; symbol stub for: _exit
0000000100000c83 hlt
...
; the text section ends at 100000deb
It's not a legal program, but I think the standard is a little ambiguous as to whether a diagnostic is required or it is undefined behavior. (From a quality of implementation point of view, I'd expect a diagnostic.)
It will set the global variable main (an integer) to the value 195 after printing out "Hello world!". You will still need to define the function main for it to execute.