Why does Linux use a swap partition when the kernel supports paging/virtual memory anyway?

Yes, it is just a matter of terminology; in many cases a swap partition is used as the backing store for virtual memory.

The reason UNIX and UNIX-like systems prefer swap partitions to page files is that a partition is contiguous, which results in lower seek times than a page file, which may be fragmented.

I don't know where you got the notion that “swapping means, that a process is either completely in physical memory or on the hard drive”. That meaning has not been in use for a few decades. Quoting Wikipedia:

Historically, swapping referred to moving from/to secondary storage a whole program at a time, in a scheme known as roll-in/roll-out. In the 1960s, after the concept of virtual memory was introduced (in two variants, either using segments or pages), the term swapping was applied to moving, respectively, either segments or pages, between disk and memory. Today with the virtual memory mostly based on pages, not segments, swapping became a fairly close synonym of paging, although with one difference.

Indeed, in any context involving Linux (or other Unix systems, for that matter), paging and swapping are pretty much synonymous. Both refer to a use of virtual memory where the data of a page can be stored either in RAM or on disk. (A page is 4 kB on most devices you're likely to encounter, though some architectures use larger pages.) The program using the memory page doesn't care, or even know, where the data is stored; it just keeps using the virtual address. The kernel transfers data between RAM and the disk and updates the MMU tables as it goes, so that the entry for the virtual address either points to a physical page in memory or contains a special value that causes the processor to execute kernel code which loads the appropriate data from disk.
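To make the "page" unit concrete, here is a minimal Python sketch that asks the running system for its page size and splits a virtual address into a page number and an offset within that page. The helper name split_address is made up for illustration; the real translation is done by the MMU, not by user code.

```python
import mmap

# Page size reported by the running system, in bytes (commonly 4096).
PAGE_SIZE = mmap.PAGESIZE


def split_address(virtual_address, page_size=PAGE_SIZE):
    """Return (virtual page number, offset inside that page).

    Hypothetical helper: it only mirrors the arithmetic, it does not
    consult any real page table.
    """
    return divmod(virtual_address, page_size)


if __name__ == "__main__":
    print("page size on this machine:", PAGE_SIZE)
    # With 4096-byte pages, address 0x12345 lies in page 0x12 at offset 0x345.
    print(split_address(0x12345, 4096))
```

Only the page number is looked up in the MMU tables; the offset is carried over unchanged into the physical address.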

Paging refers to this generic process. Swapping refers to the case where the on-disk data is in a dedicated area: the swap area (a swap partition or swap file). Paging can also be done between RAM and a file, and in this case it's usually not referred to as swapping. For example, when you execute a program, the code has to be loaded into memory to be executed; if a code page needs to be evicted from RAM to make room for something else, then there's no need to write this page onto the swap area, because it can be loaded back from the program file. (This can be done for all read-only data, not just program code.)
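Paging between RAM and an ordinary file can be observed from userspace with a read-only memory mapping. The following is a small sketch using Python's standard mmap module; the function names and the temporary file are invented for the example. The mapped pages are loaded on access, and because they are read-only copies of the file, the kernel could discard them and reload them from the file without touching swap.

```python
import mmap
import os
import tempfile


def read_through_mapping(path, offset, length):
    """Map a file read-only and read bytes from it through the mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]


def demo():
    # Build a throwaway file that is slightly longer than one page.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"A" * mmap.PAGESIZE + b"second page")
        path = tmp.name
    try:
        # Reading at this offset touches the second page of the mapping.
        return read_through_mapping(path, mmap.PAGESIZE, 11)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```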

If the physical memory is (almost) full, the kernel looks for a page in RAM (not a whole process) that hasn't been used recently. If that page is an unmodified copy of the content of a disk file (there are tables in the kernel to indicate this), the page can simply be reclaimed. If not, the page is written out to swap, then reclaimed. Either way, the kernel updates the entry in the process's virtual memory table (which becomes the MMU table while the process executes) to mark the page as not in RAM, and can then reuse the physical page for something else (a different program, or another page of the same program).
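The reclaim decision described above can be sketched as a toy simulation. This is not how the Linux kernel is implemented (it uses approximations of least-recently-used ordering and also has to handle modified file pages, which are left out here); the Page class and the function names are hypothetical.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Page:
    data: bytes
    file_backed: bool  # True if an identical copy exists in a file on disk


def touch(ram, page_id):
    """Record a use of the page by moving it to the most-recent end."""
    ram.move_to_end(page_id)


def reclaim_one(ram, swap):
    """Evict the least recently used page and report what was done with it."""
    page_id, page = ram.popitem(last=False)
    if page.file_backed:
        action = "dropped"          # can be reloaded from its file later
    else:
        swap[page_id] = page.data   # no other copy exists, so save it first
        action = "swapped out"
    return page_id, action


def demo():
    ram = OrderedDict()
    ram["code:0"] = Page(b"\x90" * 4, file_backed=True)
    ram["heap:0"] = Page(b"user data", file_backed=False)
    ram["code:1"] = Page(b"\xc3" * 4, file_backed=True)
    touch(ram, "code:0")  # code:0 is now the most recently used page
    swap = {}
    steps = [reclaim_one(ram, swap) for _ in range(2)]
    return steps, swap


if __name__ == "__main__":
    print(demo())
```

In the demo, the heap page has to be written to the toy swap area before its frame is reused, while the code page is simply dropped.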

The virtual memory/paging facility lets a kernel "virtualize" memory to userspace processes. The kernel can take pages from physical memory, and arrange them through paging so they appear contiguous to a userspace process.

A limit can be set on a userspace process's memory, and if the process goes beyond it, a "page fault" occurs: a CPU exception that transfers control back to the kernel. This prevents the userspace program from messing with memory allocated to the kernel or other programs without the kernel's permission.

Typically, userspace programs ask the kernel to extend this limit via well-defined interfaces (the C functions malloc() and free(), for example, rely on them). The kernel is responsible for keeping track of how much memory, and which memory, a program has been allocated.
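One such interface is an anonymous memory mapping, which C libraries commonly use behind malloc() for larger requests. As a rough illustration, Python's standard mmap module can request one directly; the helper name allocate_pages is made up for this sketch.

```python
import mmap


def allocate_pages(count):
    """Ask the kernel for `count` pages of anonymous, zero-filled memory.

    Passing -1 as the file descriptor requests memory that is not backed
    by any file.
    """
    return mmap.mmap(-1, count * mmap.PAGESIZE)


if __name__ == "__main__":
    region = allocate_pages(4)
    region[0:5] = b"hello"          # the first write touches the first page
    print(len(region), region[0:5])
    region.close()                  # hand the memory back to the kernel
```

Pages of such a mapping have no file to fall back on, so they are the kind of data that would go to the swap area under memory pressure.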

This "page fault" mechanism also lets the kernel overprovision memory (both Windows and Linux support this): when a process touches a page that has been moved to disk, the kernel brings it back in, exchanging it with another page if necessary, which is why it is called swapping. If the memory access was indeed invalid (i.e. the process is trying to access memory it didn't ask for first), the process will typically be killed with a SIGSEGV.
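The two outcomes of a page fault can be modeled with a small toy class. Everything here (ToyAddressSpace, SegmentationFault, the dictionaries standing in for RAM and disk) is hypothetical and only mirrors the logic; a real kernel does this in its fault handler with hardware page tables.

```python
class SegmentationFault(Exception):
    """Stands in for the SIGSEGV a real process would receive."""


class ToyAddressSpace:
    def __init__(self, valid_pages):
        self.valid = set(valid_pages)  # pages the process has asked for
        self.resident = {}             # page number -> contents in "RAM"
        self.swapped = {}              # page number -> contents on "disk"
        self.faults = 0

    def read(self, page):
        if page not in self.resident:  # the MMU would raise a fault here
            self._page_fault(page)
        return self.resident[page]

    def _page_fault(self, page):
        self.faults += 1
        if page not in self.valid:
            raise SegmentationFault(page)  # access the process never asked for
        # Valid but not in RAM: restore from swap, or zero-fill on first use.
        self.resident[page] = self.swapped.pop(page, b"\x00")

    def swap_out(self, page):
        """Move a resident page to the toy swap area."""
        self.swapped[page] = self.resident.pop(page)


if __name__ == "__main__":
    space = ToyAddressSpace({0, 1})
    space.read(0)                  # first touch: fault, zero-filled page
    space.resident[0] = b"data"
    space.swap_out(0)
    print(space.read(0), space.faults)  # fault again, restored from swap
```

The program calling read() never sees where the data came from, only that the access took longer.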

So "swapping" is an additional feature (in Linux you can actually disable it entirely if you want) that depends on virtual memory/paging, but isn't required just because a CPU has virtual memory/paging. The concepts are not the same but swapping depends on paging/virtual memory to exist.
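Whether a Linux system currently has any swap configured can be read from /proc/meminfo, which reports SwapTotal and SwapFree in kB. The sketch below parses that format; the function names are invented for the example, and the script falls back to a message on systems without /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse 'Name:   12345 kB' lines into a {name: value} dictionary."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_usage_kb(fields):
    """Return (total swap, swap in use), both in kB."""
    total = fields.get("SwapTotal", 0)
    return total, total - fields.get("SwapFree", 0)


def swap_enabled(fields):
    """A system with swap disabled reports a SwapTotal of 0."""
    return fields.get("SwapTotal", 0) > 0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        print("/proc/meminfo is not available on this system")
    else:
        print("swap enabled:", swap_enabled(info))
        print("total / used (kB):", swap_usage_kb(info))
```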

Also, after reading your question more closely: "paging" is sometimes used as a synonym for "swapping", but I've never heard of "swapping" meaning the whole process's memory is swapped out vs. "paging" meaning only part of it is swapped out.

But why does Linux need a swap partition then? If the physical memory is full, some processes will be outsourced to the hard drive and a new process will be mapped from virtual memory to physical memory.

"Virtual memory" is physical memory, just "remapped." The MMU hardware cannot directly map to any storage device. The MMU can throw a fault that tells the kernel a process tried to access memory it shouldn't have - and the kernel can use this mechanism to see that a process wants something back from disk that it thought was in memory and then do the "swap". The point being it's the operating system that decides to save pages to disk so it can use those pages for other processes, not the hardware.
