Would an executable need an OS kernel to run?

I know that when the source code, in say C++, is compiled, the output from the compiler is the machine code (executable) which I thought were instructions to the CPU directly. Recently I was reading up on kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

So when we compile some simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

I have read a similar question. It did not explain if the machine code that is generated after compilation is an instruction to the CPU directly or if it will need to again go through the kernel to create the correct instruction for the CPU. I.e., what happens after the machine code is loaded into memory? Will it go through the kernel or directly talk to the processor?


As someone who has written programs that execute without an OS, I offer a definitive answer.

Would an executable need an OS kernel to run?

That depends on how that program was written and built.
You could write a program (assuming you have the knowledge) that does not require an OS at all.
Such a program is described as standalone.
Boot loaders and diagnostic programs are typical uses for standalone programs.

However, the typical program written and built in some host OS environment would default to executing in that same host OS environment.
Very explicit decisions and actions are required to write and build a standalone program.


... the output from the compiler is the machine code (executable) which I thought were instructions to the CPU directly.

Correct.

Recently I was reading up on kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

That's a restriction imposed by a CPU mode that the OS uses to execute programs, and facilitated by certain build tools such as compilers and libraries.
It is not an intrinsic limitation on every program ever written.


So when we compile some simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

Every instruction is executed by the CPU.
An instruction that is unsupported or illegal (e.g. process has insufficient privilege) will cause an immediate exception, and the CPU will instead execute a routine to handle this unusual condition.

A printf() function should not be used as an example of "simple source code".
The translation from an object-oriented high-level programming language to machine code may not be as trivial as you imply.
And then you choose one of the most complex functions from a runtime library that performs data conversions and I/O.

Note that your question stipulates an environment with an OS (and a runtime library).
Once the system is booted, and the OS is given control of the computer, restrictions are imposed on what a program can do (e.g. I/O must be performed by the OS).
If you expect to execute a standalone program (i.e. without an OS), then you must not boot the computer to run the OS.


... what happens after the machine code is loaded into memory?

That depends on the environment.

For a standalone program, it can be executed, i.e. control is handed over by jumping to the program's start address.

For a program loaded by the OS, the program has to be dynamically linked with shared libraries it is dependent on. The OS has to create an execution space for the process that will execute the program.

Will it go through the kernel or directly talk to the processor?

Machine code is executed by the CPU.
It does not "go through the kernel", but nor does it "talk to the processor".
The machine code (consisting of op code and operands) is an instruction to the CPU that is decoded and the operation is performed.

Perhaps the next topic you should investigate is CPU modes.


The kernel is "just" more code. It's just that that code is a layer that lives between the lowest parts of your system and the actual hardware.

All of it runs directly on the CPU; you just transition up through layers of it to do anything.

Your program "needs" the kernel in just the same way it needs the standard C libraries in order to use the printf command in the first place.

The actual code of your program runs on the CPU, but the branches that code makes to print something on screen go through the code for the C printf function, through various other systems and interpreters, each of which does its own processing to work out just how hello world! actually gets printed on your screen.

Say you have a terminal program running on a desktop window manager, running on your kernel which in turn is running on your hardware.

There's a lot more that goes on but let's keep it simple...

  1. In your terminal program you run your program to print hello world!
  2. The terminal sees that the program has written (via the C output routines) hello world! to the console
  3. The terminal program goes up to the desktop window manager saying "I got hello world! written at me, can you put it at position x, y please?"
  4. The desktop window manager goes up to the kernel with "one of my programs wants your graphics device to put some text at this position, get to it dude!"
  5. The kernel passes the request out to the graphics device driver, which formats it in a way that the graphics card can understand
  6. Depending on how the graphics card is connected, other kernel device drivers need to be called to push the data out on physical device buses such as PCIe, handling things like making sure the correct device is selected, and that the data can pass through the relevant bridges or converters
  7. The hardware displays stuff.

This is a massive oversimplification for description only. Here be dragons.

Effectively, everything you do that needs hardware access, be it display, blocks of memory, bits of files or anything like that, has to go through some device driver in the kernel to work out exactly how to talk to the relevant device, be it a filesystem driver on top of a SATA hard disk controller driver which itself is sitting on top of a PCIe bridge device.

The kernel knows how to tie all these devices together and presents a relatively simple interface for programs to do things without having to know about how to do all of these things themselves.

Desktop window managers provide a layer that means that programs don't have to know how to draw windows and play well with other programs trying to display things at the same time.

Finally the terminal program means that your program doesn't need to know how to draw a window, nor how to talk to the kernel graphics card driver, nor all of the complexity to do with dealing with screen buffers and display timing and actually wiggling the data lines to the display.

It's all handled by layers upon layers of code.


It depends on the environment. In many older (and simpler!) computers, such as the IBM 1401, the answer would be "no". Your compiler and linker emitted a standalone "binary" that ran without any operating system at all. When your program stopped running, you loaded a different one, which also ran with no OS.

An operating system is needed in modern environments because you aren't running just one program at a time. Sharing the CPU core(s), the RAM, the mass storage device, the keyboard, mouse, and display among multiple programs at once requires coordination. The OS provides that. So in a modern environment your program can't just read and write the disk or SSD; it has to ask the OS to do that on its behalf. The OS gets such requests from all the programs that want to access the storage device, implements things like access controls (can't allow ordinary users to write to the OS's files), queues them to the device, and sorts out the returned information to the correct programs (processes).

In addition, modern computers (unlike, say, the 1401) support the connection of a very wide variety of I/O devices, not just the ones IBM would sell you in the old days. Your compiler and linker can't possibly know about all of the possibilities. For example, your keyboard might be interfaced via PS/2, or USB. The OS allows you to install device-specific "device drivers" that know how to talk to those devices, but present a common interface for the device class to the OS. So your program, and even the OS, doesn't have to do anything different for getting keystrokes from a USB vs a PS/2 keyboard, or for accessing, say, a local SATA disk vs a USB storage device vs storage that's somewhere off on a NAS or SAN. Those details are handled by device drivers for the various device controllers.

For mass storage devices, the OS provides atop all of those a file system driver that presents the same interface to directories and files regardless of where and how the storage is implemented. And again, the OS worries about access controls and serialization. In general, for example, the same file shouldn't be opened for writing by more than one program at a time without jumping through some hoops (but simultaneous reads are generally ok).

So in a modern general-purpose environment, yes - you really need an OS. But even today there are computers such as real-time controllers that aren't complicated enough to need one.

In the Arduino environment, for example, there isn't really an OS. Sure, there's a bunch of library code that the build environment incorporates into every "binary" it builds. But since there is no persistence of that code from one program to the next, it's not an OS.


I think many answers misunderstand the question, which boils down to this:

A compiler outputs machine code. Is this machine code executed directly by a CPU, or is it "interpreted" by the kernel?

Basically, the CPU directly executes the machine code. It would be significantly slower to have the kernel execute all applications. However, there are a few caveats.

  1. When an OS is present, application programs typically are restricted from executing certain instructions or accessing certain resources. For example, if an application executes an instruction which modifies the system interrupt table, the CPU will instead jump to an OS exception handler so that the offending application is terminated. Also, applications are usually not allowed to read/write to device memory. (I.e. "talking to the hardware".) Accessing these special memory regions is how the OS communicates with devices like the graphics card, network interface, system clock, etc.

  2. The restrictions an OS places on applications are achieved by special features of the CPU, such as privilege modes, memory protection, and interrupts. Although any CPU you would find in a smartphone or PC has these features, certain CPUs do not. These CPUs do indeed need special kernels which "interpret" application code in order to achieve the features that are desired. A very interesting example is the Gigatron, which is an 8-instruction computer you can build out of chips which emulates a 34-instruction computer.

  3. Some languages like Java "compile" to something called bytecode, which is not really machine code. Although in the past bytecode was interpreted to run the program, these days something called Just-in-Time compilation is usually used, so it does end up running directly on the CPU as machine code.

  4. Running software in a Virtual Machine used to require its machine code to be "interpreted" by a program called a Hypervisor. Due to enormous industry demand for VMs, CPU manufacturers have added features like VT-x to their CPUs to allow most instructions of a guest system to be executed directly by the CPU. However, when running software designed for an incompatible CPU in a Virtual Machine (for example, emulating an NES), the machine code will need to be interpreted.


When you compile your code, you create so-called "object" code that (in most cases) depends on system libraries (printf, for example). The linker then wraps your code in an executable format that your particular operating system's program loader recognizes (which is why you can't run a program compiled for Windows on Linux, for example) and knows how to unwrap and execute. So your program is like the meat inside a sandwich: it can only be eaten as a bundle, as a whole.

Recently I was reading up on Kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

Well, it is halfway true. If your program is a kernel-mode driver, then you actually can access hardware directly if you know how to "talk" to it, but usually (especially for undocumented or complicated hardware) people use drivers, which are kernel libraries. This way you can find API functions that know how to talk to the hardware in an almost human-readable way, without needing to know addresses, registers, timing and a bunch of other things.

will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

Well, the kernel is like a waitress whose responsibility is to walk you to a table and serve you. The only thing she can't do is eat for you; you have to do that yourself. It is the same with your code: the kernel will unpack your program into memory and start your code, which is machine code executed directly by the CPU. The kernel just needs to supervise you, deciding what you are and are not allowed to do.

It did not explain if the machine code that is generated after compilation is an instruction to the CPU directly or if it will need to again go through the kernel to create the correct instruction for the CPU.

Machine code that is generated after compilation is an instruction to the CPU directly. No doubt about that. The only thing you need to keep in mind is that not everything in the compiled file is actual machine (CPU) code. The linker wrapped your program with some metadata that only the kernel can interpret, as a clue to what to do with your program.

What happens after the machine code is loaded into memory? Will it go through the kernel or directly talk to the processor?

If your code is just simple opcodes, like the addition of two registers, then it will be executed directly by the CPU without kernel assistance. But if your code uses library functions that need OS services (I/O, for example), those calls will be assisted by the kernel. As in the waitress example: if you want to eat in a restaurant, they give you tools such as a fork and a spoon (which remain their assets), but what you do with them is up to your "code".

Well, just to prevent flames in the comments: this is a really oversimplified model that I hope will help the OP understand the basics, but good suggestions to improve this answer are welcome.