I am learning how Linux OS installation works, but searching the Internet hasn't answered my questions.

Note: This question has been marked as off-topic on Server Fault so I am asking here.

Red Hat's documentation has pretty neat information, but it is in pieces, and I can't glue those pieces together to get a complete answer.
From those pieces I am able to understand how the bootloader works and how it starts the ramdisk and kernel, and then systemd or init.
I can't find any references on how the initial OS installation works.
This community has great professionals who are experts in this topic, so I hope to get answers to my questions here.

There are multiple questions here; please feel free to answer each question and add a reference to it if possible.

  1. During the boot process the MBR gets read and the bootloader is initialized; during a normal boot the kernel gets loaded by the bootloader, then after some magic the login screen appears.
  2. If 1 holds then what is the flow while installing the OS? Does the kernel still get loaded to launch the installer script or is the OS installer a minimal script that can be called by the bootloader?
  3. If the kickstart file is used then exactly when does the file get parsed and the contents executed during a fresh OS installation?
  4. What are the files or scripts required for OS installation to work? For normal booting we need initrd and vmlinuz, so what do installers need? I think we have the installation tree (the ISO extracted and served by an HTTP server).
  5. The RHEL docs say it uses the Anaconda installer, but that is written in Python, so how does it work even before the kernel or interpreter gets loaded? I checked whether it is compiled to a CPU-specific format so that it can run directly on the CPU, but I can't find anything about that.

Solution 1:

If 1 holds then what is the flow while installing the OS? Does the kernel still get loaded to launch the installer script or is the OS installer a minimal script that can be called by the bootloader?

Almost always it's a full OS with a normal app running on it. For example, the graphical installers of both Debian and Fedora were GTK apps running on X11 (Xorg). In both cases there's regular Linux underneath, often with a console login at Alt+F2.

(Similarly, the Windows installer is a setup.exe that runs on WinPE, a Windows variant tailored for this kind of usage. If you hit Shift+F10 it'll launch cmd.exe.)

Having the installer use a full OS makes many things easier, as it can use the same concepts and facilities as the OS that it is installing. For example, the installer needs to be able to format and later read/write the filesystem that your Linux / partition will use. Since the installer is a regular Linux app, it can just use the regular mkfs and mount that Linux already has.
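As a rough illustration of that point, here is a minimal Python sketch of how an installer-like program could plan the ordinary mkfs and mount invocations for the target root. The function name, the device /dev/sdX1 and the mountpoint /mnt/sysroot are hypothetical placeholders, and the sketch only builds the command lines rather than running them; it is not how any particular installer is implemented.

```python
import shlex


def plan_target_setup(device, fstype="ext4", mountpoint="/mnt/sysroot"):
    """Return the commands an installer might run to prepare the target root.

    Nothing is executed here; device and mountpoint are hypothetical.
    """
    return [
        # Create the filesystem on the chosen partition.
        ["mkfs", "-t", fstype, device],
        # Attach it so the new system's files can be copied in.
        ["mount", "-t", fstype, device, mountpoint],
    ]


if __name__ == "__main__":
    for cmd in plan_target_setup("/dev/sdX1"):
        print(shlex.join(cmd))
```

A real installer would pass each list to something like subprocess.run after the user has confirmed the partitioning, since mkfs destroys whatever is on the device.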

So the flow for Linux installers specifically is very much like booting a real Linux system:

  1. Firmware loads and runs the bootloader from your installation CD or USB stick.
  2. Bootloader loads a Linux kernel and initramfs from the USB stick (and might be preconfigured to pass some installer-specific kernel parameters).
  3. The initramfs mounts the real root filesystem, which is slightly complex as most Linux "live" systems don't put all the OS files straight on the USB stick, but carry them packaged into a 'squashfs' archive:
    • The initramfs mounts the FAT32 filesystem of the USB stick itself...
    • Then it loop-mounts the SquashFS archive of the real Linux root filesystem from there...
    • Then it mounts an 'overlayfs' or 'unionfs' virtual filesystem that allows the read-only SquashFS base to become writable, with all changes stored in RAM instead...
    • ...and that overlayfs is what becomes the "/" for the actual Linux OS where the installer runs.
  4. The initramfs launches the regular 'init' process, which starts some typical services as well as some installer-specific ones.
  5. Finally, instead of starting a login screen, init directly starts the "installer" app (either a text-mode app on tty1, or an X11 app with Xorg).
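The mount stacking in step 3 can be sketched as a short sequence of commands. The Python snippet below only generates that sequence as text; the device name, the image path LiveOS/squashfs.img and the /run/... directories are assumptions chosen for illustration, and real initramfs implementations differ in their layout and details.

```python
def live_root_mount_plan(boot_dev="/dev/sdX1", image="LiveOS/squashfs.img"):
    """Return the shell commands a live initramfs might run, in order.

    All paths are hypothetical; nothing is executed.
    """
    media = "/run/media"        # where the USB stick's own filesystem goes
    base = "/run/rootfsbase"    # read-only SquashFS contents
    rw = "/run/overlay"         # RAM-backed area for changes
    newroot = "/sysroot"        # what becomes "/" for the live system
    overlay_opts = f"lowerdir={base},upperdir={rw}/upper,workdir={rw}/work"
    return [
        f"mount -t vfat -o ro {boot_dev} {media}",
        f"mount -t squashfs -o loop,ro {media}/{image} {base}",
        f"mount -t tmpfs tmpfs {rw}",
        f"mkdir -p {rw}/upper {rw}/work",
        f"mount -t overlay overlay -o {overlay_opts} {newroot}",
    ]


if __name__ == "__main__":
    for line in live_root_mount_plan():
        print(line)
```

The overlay mount is the interesting part: lowerdir is the read-only base, while upperdir and workdir live on the tmpfs, which is why changes made in the live session disappear on reboot.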

As you can see, aside from the overlayfs-based root, it boots like Linux normally would. In general, Linux installers are just a special case of the "live" system, and most distributions publish the tools they use to build them, i.e. both the ISO builder and the actual UI are open-source. For example, you could analyze the archiso tool that Arch Linux uses to create bootable Linux ISOs.

(Arch is a relevant case here, because through the last few years, it did not have an installer as such – it would boot you straight into a Linux shell, where you'd partition the disk and install a set of initial packages, and that's your Linux system installed. So you get a very hands-on look at what Anaconda or d-i would be doing under the hood.)

The RHEL docs say it uses the Anaconda installer, but that is written in Python, so how does it work even before the kernel or interpreter gets loaded?

They don't. No part of Anaconda needs to run before the kernel or interpreter gets loaded.

(Remember that, as mentioned above, the installer environment actually has its own Linux kernel and its own /usr/bin/python; it doesn't need to wait until the "real" kernel gets installed.)
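To make that concrete, the small Python sketch below reports which kernel and which interpreter the script itself is running under. Run inside an installer environment, any Python program could do the same, which shows that both are already present by the time Python code executes. The function name is made up for this example and is not part of Anaconda.

```python
import os
import platform
import sys


def runtime_environment():
    """Describe the kernel and interpreter this process is running on."""
    return {
        "kernel": platform.release(),   # release string of the running kernel
        "interpreter": sys.executable,  # path of the Python binary in use
        "pid": os.getpid(),             # an ordinary process ID, like any app
    }


if __name__ == "__main__":
    for key, value in runtime_environment().items():
        print(f"{key}: {value}")
```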

If the kickstart file is used then exactly when does the file get parsed and the contents executed during a fresh OS installation?

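Generally, it's just read by the installer app. There's no specific "when"; each installer does its own thing. With Anaconda, for instance, the kickstart location is typically passed on the kernel command line (the inst.ks= boot option), so the file can only be fetched and parsed after the installer's own kernel has booted.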
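As an illustration of how an installer could find out where its kickstart file is, here is a minimal Python sketch that parses a kernel command line and looks for Anaconda's inst.ks= option. The helper names and the example.com URL are hypothetical, and this is not Anaconda's actual code, only a sketch of the general idea.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.

    "key=value" tokens map key to value; bare flags map to None.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")  # split on the first "=" only
        params[key] = value if sep else None
    return params


def kickstart_location(cmdline):
    """Return the value of inst.ks=, or None if it was not given."""
    return parse_cmdline(cmdline).get("inst.ks")


if __name__ == "__main__":
    example = "BOOT_IMAGE=/vmlinuz inst.ks=http://example.com/ks.cfg quiet"
    print(kickstart_location(example))
```

On a running Linux system the real command line can be read from /proc/cmdline, which is where the parameters set by the bootloader end up.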