What kind of C is an operating system written in?
Excellent questions, all. The answer is: little to none of the standard C library is available in the "dialect" of C used to write an operating system. In the Linux kernel, for example, the standard memory allocation functions malloc, free, etc. are replaced with the kernel-internal allocation functions kmalloc and kfree, which come with special restrictions on their use. The operating system must provide its own "heap": in the Linux kernel, physical memory pages that have been allocated for kernel use must be non-pageable and often physically contiguous. See this Linux Journal article on kmalloc and kfree. Similarly, the operating system kernel maintains its own special call stack, the use of which requires, if I remember correctly, special support from the GCC compiler.
Also, how much of an operating system would actually be written in C? All of it?
As far as I'm aware, operating systems are overwhelmingly written in C. Some architecture-specific features are coded in assembler, but usually as little as possible, to improve portability and maintainability: the Linux kernel has some assembler but tries to keep it to a minimum.
What about architecture dependent code? What about the higher levels of abstraction--does that ever get written in higher level languages, like C++?
Usually the kernel will be written in pure C, but sometimes the higher level frameworks and APIs are written in a higher level language. For example, the Cocoa framework/API on macOS is written in Objective-C, and the BeOS higher level APIs were written in C++. Much of Microsoft's .NET framework was written in C#, with the "Common Language Runtime" written in a mix of C++ and assembler. The Qt widget set most often used on Linux is written in C++. Of course, this introduces philosophical questions about what counts as "the operating system."
The Linux kernel is definitely worth looking at for this, although, it must be said, it is huge and intimidating for anyone to read from scratch.
What kind of C?
Mostly ANSI C, with a lot of time looking at the machine code it generates.
But, does an OS even have a heap?
malloc asks the operating system for memory the program is allowed to use and returns a pointer to it. If a program running on an OS (in user mode) tries to access memory it doesn't own, it gets a segmentation fault. The OS itself is allowed to access all the physical memory on the system directly: no malloc is needed, and no address that physically exists will cause a seg-fault.
What about a call stack?
The call stack is often supported directly by the hardware: the CPU has a stack pointer register, and on some architectures a link register holds the return address.
For file access, the OS needs access to a disk driver, which needs to know how to read the file system that's on the disk (there are a lot of different kinds). Sometimes the OS has one built in, but I think it's more common that the boot loader hands it one to start with, and it then loads another (bigger) one. The disk driver has access to the hardware I/O of the physical disk, and builds from that.
C is a very low level language, and you can do a lot of things directly. Any of the C library functions (like malloc, printf, clrscr etc.) need to be implemented first before you can call them from C (have a look at libc concepts, for example). I'll give an example below.
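As a rough illustration of what "implementing them first" can mean, here is a minimal sketch of two library-style routines written without any library underneath. The names k_strlen and k_memset are made up for this sketch, and real implementations are usually far more optimized.

```c
#include <stddef.h>   /* size_t only; this header is usable without a C library */

/* Hypothetical name: count the bytes before the terminating '\0'. */
size_t k_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Hypothetical name: fill n bytes at dst with value, one byte at a time. */
void *k_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    while (n--)
        *d++ = (unsigned char)value;
    return dst;
}
```

Nothing here calls into an operating system, which is why code like this can run inside one.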
Let us see how the C library functions are implemented under the hood. We'll go with a clrscr example. When you implement such functions, you access system devices directly. For example, for clrscr (clearing the screen) we know that on a PC in text mode the video memory resides at 0xB8000. Hence, to write to the screen or to clear it, we start by assigning a pointer to that location.
In video.c
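void clrscr(void)
{
    /* Text-mode video memory: each cell is a character byte followed by an attribute byte. */
    volatile unsigned char *vidmem = (volatile unsigned char *)0xB8000;
    const long size = 80*25;    /* 80 columns x 25 rows */
    long loop;

    for (loop=0; loop<size; loop++) {
        *vidmem++ = 0;      /* character: blank */
        *vidmem++ = 0xF;    /* attribute: white on black */
    }
}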
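The same idea can be sketched so that it is testable outside a kernel: the routine below takes the buffer as a parameter instead of hard-coding 0xB8000, and treats each text cell as one 16-bit value. The names vga_cell and text_clear are assumptions for this sketch, not part of the tutorial above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack a character (low byte) and an attribute
   (high byte) into one 16-bit cell, matching the byte order used above
   on a little-endian machine. */
static inline uint16_t vga_cell(unsigned char ch, unsigned char attr)
{
    return (uint16_t)(ch | ((uint16_t)attr << 8));
}

/* Hypothetical helper: fill cols*rows cells with spaces in the given attribute. */
void text_clear(volatile uint16_t *buf, size_t cols, size_t rows,
                unsigned char attr)
{
    for (size_t i = 0; i < cols * rows; i++)
        buf[i] = vga_cell(' ', attr);
}
```

In a kernel you would presumably call it as text_clear((volatile uint16_t *)0xB8000, 80, 25, 0x0F); on a normal OS you can point it at an ordinary array to check the logic.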
Let us write our mini kernel now. This will clear the screen when the control is handed over to our 'kernel' from the boot loader. In main.c
void clrscr(void);   /* defined in video.c */

void main()
{
    clrscr();
    for(;;);    /* never return: there is nothing to return to */
}
To compile our 'kernel', you might use gcc to compile it to a pure bin format.
gcc -ffreestanding -c main.c -o main.o
gcc -ffreestanding -c video.c -o video.o
ld -e _main -Ttext 0x1000 -o kernel.o main.o video.o
ld -i -e _main -Ttext 0x1000 -o kernel.o main.o video.o
objcopy -R .note -R .comment -S -O binary kernel.o kernel.bin
As you can see from the ld parameters above, we specify the load address of the kernel as 0x1000 (-Ttext 0x1000). Now you need to create a boot loader. From your boot loader, you pass control to your kernel with something like
jmp 08h:01000h
You normally write your boot loader in assembly. Even before that, you may want to look at how a PC boots.
Better to start with a tinier operating system to explore. See this Roll Your Own OS tutorial:
http://www.acm.uiuc.edu/sigops/roll_your_own/
But how much of it, and what kind of C?
Some parts must be written in assembly.
I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something.
Some OSes have a heap. At the lowest level, memory is doled out in fixed-size slabs called pages. Your C library then partitions those pages into variable-sized chunks with its own scheme, through malloc. You should learn about virtual memory, which is a common memory scheme in modern OSes.
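To make the "partitioning" step concrete, here is a deliberately simple sketch: a bump allocator that carves variable-sized chunks out of one page-sized block. The names kheap, kheap_alloc and KHEAP_SIZE are invented for illustration, and real malloc implementations are much more elaborate (this one cannot even free).

```c
#include <stddef.h>

#define KHEAP_SIZE 4096u   /* assumed: one 4 KiB page backs the whole heap */

static _Alignas(16) unsigned char kheap[KHEAP_SIZE];  /* the block being carved up */
static size_t kheap_used = 0;                         /* bytes handed out so far */

/* Hand out the next free chunk, rounded up to 16 bytes.
   Returns NULL when the block is exhausted. There is no free(). */
void *kheap_alloc(size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;   /* round up to a multiple of 16 */

    /* rounded < size catches wrap-around on absurdly large requests */
    if (rounded < size || rounded > KHEAP_SIZE - kheap_used)
        return NULL;

    void *p = &kheap[kheap_used];
    kheap_used += rounded;
    return p;
}
```

A free list or size classes would be the next step if chunks need to be returned and reused.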
When you want to open or create a file in C, the appropriate functions ask the operating system for that file. so... What kind of C is on the other side of that call?
You call into assembly routines that talk to hardware with instructions like IN and OUT. With raw memory access, some regions of memory are dedicated to communicating to and from hardware; this is called memory-mapped I/O. (DMA, by contrast, is when a device reads or writes main memory itself, without the CPU copying the data.)
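Talking to hardware through a dedicated region of memory usually comes down to reading and writing through volatile pointers. The sketch below uses a made-up device (struct fake_uart, with an invented register layout and ready bit) so that it can be exercised against ordinary memory; a real driver would take the layout and the address from the hardware documentation.

```c
#include <stdint.h>

/* Hypothetical device registers, invented for this sketch. */
struct fake_uart {
    volatile uint32_t status;   /* bit 0 set = ready to accept a byte */
    volatile uint32_t data;     /* writing a byte here transmits it */
};

#define UART_TX_READY 0x1u

/* Poll the status register until the device is ready, then write one byte.
   Returns 0 on success, -1 if the device is still busy after max_spins polls. */
int uart_putc(struct fake_uart *dev, unsigned char c, unsigned max_spins)
{
    while ((dev->status & UART_TX_READY) == 0) {
        if (max_spins-- == 0)
            return -1;
    }
    dev->data = c;
    return 0;
}
```

The volatile qualifier matters here: without it the compiler is free to read status once and spin forever on the cached value.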
I'm not sure if I'll wind up being able to follow the code--or if I'll be caught in an inescapably complex web of stuff I've never seen before.
Yes you will. You should pick up a book on hardware and OSes first.
I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something. What about a call stack?
A lot of what you say in your question is actually done by the runtime library in userspace.
All the OS needs to do is load the program into memory and jump to its entry point; most details after that can be handled by the user space program. The heap and the stack are just areas of the process's virtual memory. The stack is tracked by nothing more than a pointer register in the CPU.
Allocating physical memory is something that is done at the OS level. The OS usually allocates fixed-size pages, which are then mapped into a user space process.
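A fixed-size page allocator can be sketched in a few lines. This is only an illustration under assumptions: the pool is a static array standing in for physical memory, the 4 KiB page size and the names page_alloc and page_free are made up, and the mapping of pages into a process (page tables) is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define NUM_PAGES 8u      /* tiny pool, for illustration only */

static _Alignas(4096) unsigned char page_pool[NUM_PAGES * PAGE_SIZE];
static uint8_t page_used[NUM_PAGES];   /* 1 = page is currently handed out */

/* Return the first free page, or NULL if every page is in use. */
void *page_alloc(void)
{
    for (size_t i = 0; i < NUM_PAGES; i++) {
        if (!page_used[i]) {
            page_used[i] = 1;
            return &page_pool[i * PAGE_SIZE];
        }
    }
    return NULL;
}

/* Mark a page as free again. Expects a pointer obtained from page_alloc. */
void page_free(void *page)
{
    size_t i = (size_t)((unsigned char *)page - page_pool) / PAGE_SIZE;
    if (i < NUM_PAGES)
        page_used[i] = 0;
}
```

Because every page has the same size, bookkeeping is one flag per page; a user space malloc can then sit on top and split those pages further.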