Why do you need to recompile C/C++ for each OS? [duplicate]

This is more of a theoretical question than anything. I'm a comp sci major with a huge interest in low-level programming. I love finding out how things work under the hood. My specialization is compiler design.

Anyway, as I'm working on my first compiler, things are occurring to me that are kind of confusing.

When you write a program in C/C++, the conventional wisdom is that a compiler magically turns your C/C++ code into native code for the target machine.

But something doesn't add up here. If I compile my C/C++ program targeting the x86 architecture, it would seem that the same program should run on any computer with the same architecture. But that doesn't happen. You need to recompile your code for OS X or Linux or Windows. (And yet again for 32-bit vs. 64-bit.)

I'm just wondering why this is the case. Don't we target the CPU architecture/instruction set when compiling a C/C++ program? And OS X and Windows can very well be running on the exact same architecture.

(I know Java and similar target a VM or CLR so those don't count)

If I had to take my best guess at this, I'd say C/C++ must then compile to OS-specific instructions. But every source I read says the compiler targets the machine. So I'm very confused.


Solution 1:

Don't we target the CPU architecture/instruction set when compiling a C/C++ program?

No, you don't.

I mean yes, you are compiling for a CPU instruction set. But that's not all compilation is.

Consider the simplest "Hello, world!" program. All it does is call printf, right? But there's no "printf" opcode in any instruction set. So... what exactly happens?

Well, that's part of the C standard library. Its printf function does some processing on the string and parameters, then... displays it. How does that happen? Well, it sends the string to standard out. OK... who controls that?

The operating system. And there's no "standard out" opcode either, so sending a string to standard out involves some form of OS call.

And OS calls are not standardized across operating systems. Pretty much every standard library function that does something you couldn't build on your own in C or C++ is going to talk to the OS to do at least some of its work.
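To make that concrete, here is a minimal sketch of what "send a string to standard out" can look like underneath printf. The function name print_line is made up for illustration, and this is not how any particular standard library is actually written; it only shows that the same job goes through a different OS interface on Windows (GetStdHandle and WriteFile) than on POSIX systems such as Linux and OS X (write on file descriptor 1).

```c
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>

/* Windows: look up the stdout handle, then ask the Win32 API to write to it.
   Returns the number of bytes written, or -1 on failure. */
long print_line(const char *s) {
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;
    if (!WriteFile(out, s, (DWORD)strlen(s), &written, NULL)) {
        return -1;
    }
    return (long)written;
}
#else
#include <unistd.h>

/* POSIX (Linux, OS X, ...): write to file descriptor 1 through write().
   Returns the number of bytes written, or -1 on failure. */
long print_line(const char *s) {
    return (long)write(STDOUT_FILENO, s, strlen(s));
}
#endif
```

Calling print_line("Hello, world!\n") from main compiles to the same x86 opcodes for the string handling on either system, but the call that actually reaches the OS is different, and that is the part a binary built for one OS cannot reuse on another.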

malloc? Memory doesn't belong to you; it belongs to the OS, and you maybe are allowed to have some. scanf? Standard input doesn't belong to you; it belongs to the OS, and you can maybe read from it. And so on.
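As a rough illustration of the malloc case, the sketch below asks the OS directly for a block of memory. The names os_alloc and os_free are hypothetical, and a real malloc is far more elaborate (it caches and subdivides what it gets), but at the bottom it has to make a request like one of these: mmap on POSIX systems or VirtualAlloc on Windows.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* lets glibc expose MAP_ANONYMOUS in strict -std modes */
#endif
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Windows: reserve and commit readable/writable pages. NULL on failure. */
void *os_alloc(size_t n) {
    return VirtualAlloc(NULL, n, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
}

void os_free(void *p, size_t n) {
    (void)n; /* MEM_RELEASE frees the whole region, so the size must be 0 */
    VirtualFree(p, 0, MEM_RELEASE);
}
#else
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* older BSD-style spelling */
#endif

/* POSIX: map anonymous (not file-backed) private memory. NULL on failure. */
void *os_alloc(size_t n) {
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

void os_free(void *p, size_t n) {
    munmap(p, n);
}
#endif
```

The two branches share nothing but the function signatures: different headers, different flags, different failure values. A standard library hides that behind malloc, which is exactly why the library itself has to be built per OS.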

Your standard library is built from calls to OS routines. And those OS routines are non-portable, so your standard library implementation is non-portable. So your executable has these non-portable calls in it.

And on top of all of that, different OSs have different ideas of what an "executable" even looks like. An executable isn't just a bunch of opcodes, after all; where do you think all of those constants and pre-initialized static variables get stored? Different OSs have different ways of starting up an executable, and the structure of the executable is a part of that.
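One way to see the format difference is to look at the first few bytes of an executable file. The sketch below (exe_format is a hypothetical helper, not part of any library) tells the common formats apart: ELF on Linux starts with 0x7F 'E' 'L' 'F', Windows PE files start with the "MZ" DOS header, and Mach-O files on OS X start with the magic number 0xFEEDFACE or 0xFEEDFACF, which appears byte-reversed on disk for little-endian x86 builds. It only checks these leading bytes; a real tool would validate much more.

```c
#include <stddef.h>
#include <string.h>

/* Guess the executable format from the leading bytes of a file.
   buf points at the start of the file, n is how many bytes are available. */
const char *exe_format(const unsigned char *buf, size_t n) {
    static const unsigned char elf[4]     = {0x7F, 'E', 'L', 'F'};
    static const unsigned char macho32[4] = {0xCE, 0xFA, 0xED, 0xFE};
    static const unsigned char macho64[4] = {0xCF, 0xFA, 0xED, 0xFE};

    if (n >= 4 && memcmp(buf, elf, 4) == 0) {
        return "ELF";
    }
    if (n >= 4 && (memcmp(buf, macho32, 4) == 0 ||
                   memcmp(buf, macho64, 4) == 0)) {
        return "Mach-O";
    }
    if (n >= 2 && buf[0] == 'M' && buf[1] == 'Z') {
        return "PE";
    }
    return "unknown";
}
```

Each OS's loader only understands its own format, so even a program that made no OS calls at all would be rejected before its first instruction ran.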

Solution 2:

How do you allocate memory? There's no CPU instruction for allocating dynamic memory; you have to ask the OS for the memory. But what are the parameters? How do you invoke the OS?

How do you print output? How do you open a file? How do you set a timer? How do you display a UI? All of these things require requesting services from the OS, and different OSes provide different services with different calls necessary to request them.
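As a small sketch of what "invoking the OS" means at the lowest level you can reach from C, the Linux branch below bypasses the usual write() wrapper and issues the system call by number through glibc's syscall() function. The name raw_write is made up for illustration, the code assumes a Unix-like system (it will not build on Windows as written), and on non-Linux systems it simply falls back to write(). The numeric value behind SYS_write is specific to Linux and to the CPU architecture, which is the point: another kernel uses different numbers and a different way of passing arguments.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* lets glibc declare syscall() in strict -std modes */
#endif
#include <string.h>
#include <unistd.h>

#if defined(__linux__)
#include <sys/syscall.h>

/* Linux: enter the kernel by system call number, skipping the libc wrapper. */
long raw_write(int fd, const char *s) {
    return syscall(SYS_write, fd, s, strlen(s));
}
#else
/* Other Unix-like systems: go through the portable POSIX wrapper instead. */
long raw_write(int fd, const char *s) {
    return (long)write(fd, s, strlen(s));
}
#endif
```

raw_write(1, "ok\n") writes three bytes to standard output on Linux. The same machine code handed to a different kernel would pass a request number and arguments that kernel does not interpret the same way.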