How can a C compiler be written in C? [duplicate]

This question may stem from a misunderstanding of compilers on my part, but here goes...

One can find the following statement in the preface to the first edition of K&R (page xi):

The operating system, the C compiler, and essentially all UNIX applications programs (including all of the software used to prepare this book) are written in C.

(my emphasis)

Here's what I don't understand: doesn't that C compiler have to be compiled itself before it can compile any C code? And if that C compiler is written in C, wouldn't compiling it require an already existing C compiler?!

The only way out of this infinite-regression conundrum (or chicken-and-egg problem) is that the C compiler written in C that K&R are referring to was actually compiled with an already existing C compiler that was written in a language other than C. The C compiler written in C then superseded the latter.

Or am I completely off?


It's called Bootstrapping, quoting from Wikipedia:

If one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods to solving this chicken or the egg problem include:

  1. Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that he wrote the first Pascal compiler in Fortran.
  2. Another interpreter or compiler for X has already been written in another language Y; this is how Scheme is often bootstrapped.
  3. Earlier versions of the compiler were written in a subset of X for which there existed some other compiler; this is how some supersets of Java, Haskell, and the initial Free Pascal compiler are bootstrapped.
  4. The compiler for X is cross compiled from another architecture where there exists a compiler for X; this is how compilers for C are usually ported to other platforms. Also this is the method used for Free Pascal after the initial bootstrap.
  5. Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way) and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB literate programming system.

And if you are interested, here is Dennis Richie's first C compiler source.


See the Chicken and Egg section of the Wikipedia page:

If one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods to solving this chicken or the egg problem include:

  • Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that he wrote the first Pascal compiler in Fortran.
  • Another interpreter or compiler for X has already been written in another language Y; this is how Scheme is often bootstrapped.
  • Earlier versions of the compiler were written in a subset of X for which there existed some other compiler; this is how some supersets of Java, Haskell, and the initial Free Pascal compiler are bootstrapped.
  • The compiler for X is cross compiled from another architecture where there exists a compiler for X; this is how compilers for C are usually ported to other platforms. Also this is the method used for Free Pascal after the initial bootstrap.
  • Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way) and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB literate programming system.

Usually, a first compiler is written in another language (directly in PDP11 assembler in this case, or in C for most of the "modern" languages). Then, this first compiler is used to program a compiler written in the language itself.

You can read this page about the history of the C language. You will see that it is also strongly linked to the UNIX system.