Understanding Virtual Address, Virtual Memory and Paging
Before I answer your questions (I hope I do), here are a few introductory remarks:
Remarks
The problem here is that “virtual memory” has two senses. “Virtual memory” as a technical term used by low-level programmers has (almost) nothing to do with “virtual memory” as explained to consumers.
In the technical sense, “virtual memory” is a memory management system whereby every process has its own virtual address space, and memory addresses in that address space are mapped to physical memory addresses by the OS kernel with hardware support (uses terms like TLB, multi-level page tables, page faults and walks, etc.). This is the sense of VM that you are interested in (described below).
In the non-technical sense, “virtual memory” is disk space used in lieu of RAM (uses terms like swap, backing store, etc.). This is the sense of VM that you're not particularly interested in, but it seems that you've seen some material that deals primarily with this sense of the term or muddles the two.
Question 1
what happens when my programs want to access memory address 0xFFFFFFFFF? I do only have 4GB
In this case, your “Theory 1” is closer.
VM decouples the addresses that your program “sees” and works with (virtual addresses) from physical addresses. Your 4GiB of memory may be at physical addresses from 0x0 to 0xFFFFFFFF (8 F's), but the address 0xFFFFFFFFF (9 F's) is in the user-space portion (in the canonical layout) of the virtual address space. Provided that 0xFFFFFFFFF is in a block allocated to the process, the CPU and kernel (in concert) will translate the page address 0xFFFFFF000 (assuming a 4k page, we just hack off the lower 12 bits) to a real physical page, which could have (almost) any physical base address. Suppose the physical address of that page is 0xeac000 (a relationship established when the kernel gave you the virtual page 0xFFFFFF000); then the byte at virtual address 0xFFFFFFFFF is at physical address 0x00eacfff.
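To make that arithmetic concrete, here is a minimal sketch in C, assuming 4k pages; the physical page base 0xeac000 is just the illustrative number from above (a real user-space program has no way of knowing which physical page it was given):

```c
#include <stdio.h>
#include <inttypes.h>

#define PAGE_SHIFT 12                                  /* 4k pages: 2^12 = 4096 bytes   */
#define PAGE_MASK  (((uint64_t)1 << PAGE_SHIFT) - 1)   /* low 12 bits = offset in page  */

int main(void)
{
    uint64_t vaddr = 0xFFFFFFFFFULL;       /* the virtual address in question    */
    uint64_t vpage = vaddr & ~PAGE_MASK;   /* virtual page base: 0xFFFFFF000     */
    uint64_t off   = vaddr &  PAGE_MASK;   /* offset within the page: 0xFFF      */

    uint64_t ppage = 0xEAC000ULL;          /* pretend physical base of that page */
    uint64_t paddr = ppage | off;          /* 0xEACFFF                           */

    printf("virtual  0x%" PRIx64 " = page 0x%" PRIx64 " + offset 0x%" PRIx64 "\n",
           vaddr, vpage, off);
    printf("physical 0x%" PRIx64 "\n", paddr);
    return 0;
}
```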
When you dereference 0xFFFFFFFFF (assuming 4k pages), the CPU chops off the lower 12 bits and looks up the page in the dTLB (translation lookaside buffers are virtual-to-physical page-mapping caches; there's at least one for data and one for instructions). If there's a hit, the CPU constructs the real physical address and fetches the value. If there's a TLB miss, the page tables are consulted (or “walked”; on x86 the hardware does the walk itself) to determine the right physical page, and the resulting translation is cached in the dTLB (it's highly likely to be reused almost immediately). A page fault is raised only if the walk finds no valid mapping, in which case the kernel either establishes one (for example by bringing the page in from disk) or kills the process; the access is then retried and, this time, succeeds without another walk.
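If it helps, here is a toy model of that lookup order (TLB first, page-table walk on a miss), again assuming 4k pages; the tiny direct-mapped table and the walk_page_tables() stub are invented for illustration and bear no resemblance to real hardware structures:

```c
#include <stdio.h>
#include <stdbool.h>
#include <inttypes.h>

#define PAGE_SHIFT  12
#define TLB_ENTRIES 16                    /* real dTLBs are larger, of course    */

struct tlb_entry {
    uint64_t vpn;                         /* virtual page number                 */
    uint64_t pfn;                         /* physical frame number               */
    bool     valid;
};

static struct tlb_entry dtlb[TLB_ENTRIES];

/* Stand-in for the multi-level page-table walk (done by hardware on x86).
 * Here every virtual page is simply pretended to map to frame vpn + 0x100;
 * a real walk could also find no valid mapping and trigger a page fault. */
static uint64_t walk_page_tables(uint64_t vpn)
{
    return vpn + 0x100;
}

static uint64_t translate(uint64_t vaddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    uint64_t off = vaddr & ((1u << PAGE_SHIFT) - 1);
    struct tlb_entry *e = &dtlb[vpn % TLB_ENTRIES];   /* trivially indexed       */

    if (!e->valid || e->vpn != vpn) {     /* TLB miss: walk, then cache result   */
        e->vpn   = vpn;
        e->pfn   = walk_page_tables(vpn);
        e->valid = true;
    }
    return (e->pfn << PAGE_SHIFT) | off;  /* hit (possibly the one just filled)  */
}

int main(void)
{
    printf("0x%" PRIx64 " -> 0x%" PRIx64 "\n",
           (uint64_t)0xFFFFFFFFFULL, translate(0xFFFFFFFFFULL));
    return 0;
}
```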
I admit that this description is pretty crummy (reflecting my own level of knowledge). In particular, the exact way that a particular process is identified in the TLB is not 100% clear to me and at least somewhat hardware-specific. It used to be that every context switch needed a full TLB flush, but more recent Intel CPUs tag entries with a 12-bit PCID (process-context identifier), which means that flushes, while still sometimes required, aren't needed on every context switch. Further crumminess arises from my failure to describe multi-level TLBs and PTEs (page table entries), and to address the significance of all this for data and instruction caching (although I do know that modern hardware can check whether an address could possibly be in some cache level in parallel with the TLB lookup).
Question 2
How processes are put in Virtual Memory? I mean does each process has 0x0 - 0xFFFFFFFFF virtual memory space available for them or there is only one Virtual Memory address space where all the process are placed?
Each process has its own completely distinct virtual memory space. This is (almost) the entire point of VM.
In the olden days, the TLB was not “process aware” in any sense: every context switch meant that the TLBs had to be flushed completely. Nowadays, TLB entries carry a short “process context” identifier (the PCID on x86) and support selective flushing, so you can kinda/sorta think of that identifier (assigned by the kernel to each address space; it is not literally the PID) as being prepended to the virtual page number. The TLB is therefore more process aware, and entries only need to be flushed when there's a PCID collision, i.e. when two processes end up with the same PCID.
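Conceptually (and only conceptually; the real encoding is hardware-specific and the field names here are made up), the tag a process-aware TLB matches on is the pair (PCID, virtual page number) rather than the virtual page number alone:

```c
#include <stdbool.h>
#include <stdint.h>

struct tagged_tlb_entry {
    uint16_t pcid;      /* process-context ID the kernel assigned (12 bits on x86) */
    uint64_t vpn;       /* virtual page number                                     */
    uint64_t pfn;       /* physical frame number                                   */
    bool     valid;
};

/* An entry only hits if BOTH the PCID and the VPN match, so translations for
 * the same virtual page in two different processes can sit in the TLB at once,
 * and nothing needs to be flushed on an ordinary context switch. */
static bool tlb_hits(const struct tagged_tlb_entry *e, uint16_t pcid, uint64_t vpn)
{
    return e->valid && e->pcid == pcid && e->vpn == vpn;
}
```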
Question 3
Is there one big giant page table which includes all the pages for every process or each process has its own page table?
This is OS-specific, of course, but my understanding is that on Linux each process has its own multi-level set of page tables, and the kernel switches the page-table base register (CR3 on x86) to the current process's tables on every context switch; there is not one giant table whose entries (PTEs) are tagged with the PID. Note, though, that a lot of virtual-to-physical mappings are n:1 rather than 1:1, since them all being 1:1 would largely defeat a major purpose of VM: think about shared read-only pages containing the instructions for libraries like libc, or copy-on-write data pages shared between parent and child after a fork. Those shared physical pages simply have entries in the page tables of every process that maps them.
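If you want to see the copy-on-write case from user space, here is a small sketch, assuming POSIX fork: parent and child print the same virtual address for the same variable, yet after the child writes to it each process sees its own value, because the write forced a private copy of the shared page.

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int value = 42;                            /* lives in an ordinary data/stack page */

    pid_t pid = fork();                        /* child starts with copy-on-write      */
    if (pid < 0) {                             /* references to the parent's pages     */
        perror("fork");
        return 1;
    }

    if (pid == 0) {                            /* child                                */
        value = 1000;                          /* this write triggers the actual copy  */
        printf("child : &value = %p, value = %d\n", (void *)&value, value);
        return 0;
    }

    wait(NULL);                                /* parent: same virtual address...      */
    printf("parent: &value = %p, value = %d\n", (void *)&value, value);
    return 0;                                  /* ...but it still sees 42              */
}
```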
Where Disk Comes In
Once you have a VM system, it becomes almost trivial to add the ability to retrieve a page from disk when a page fault occurs, and to implement “aging” (tracked via the accessed bits in the PTEs) so that the least recently used pages can be put on disk. Although this is an important feature on memory-constrained systems, it is almost entirely irrelevant to understanding how a VM system actually works.
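For completeness, here is a hedged sketch of the sort of “aging” meant above: a clock (second-chance) sweep that evicts a page frame whose referenced bit has not been set since the last sweep. Real kernels use much more elaborate LRU approximations; the frame array and the names below are invented for illustration:

```c
#include <stddef.h>
#include <stdbool.h>

#define NFRAMES 1024                       /* toy number of physical page frames */

struct frame {
    bool referenced;    /* set by the hardware/OS whenever the page is accessed  */
};

static struct frame frames[NFRAMES];
static size_t clock_hand;

/* Choose a frame whose page can be written out to disk: sweep the frames,
 * giving recently referenced pages a second chance by clearing their bit,
 * and evict the first page found with its bit already clear.  Terminates
 * within at most two full sweeps. */
static size_t choose_victim(void)
{
    for (;;) {
        size_t idx = clock_hand;
        clock_hand = (clock_hand + 1) % NFRAMES;

        if (frames[idx].referenced) {
            frames[idx].referenced = false;    /* recently used: spare it once   */
            continue;
        }
        return idx;                            /* not used since last sweep: evict */
    }
}
```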
Some people use the term "Virtual memory" as if it were synonymous with the Page File, since the Page File represents the part of your allocated memory that is not "real" memory (i.e. RAM). But most people consider "Virtual Memory" to be the entire abstraction layer that the Operating System gives to programs, which combines the RAM and the Page File.
I'm not sure which of these definitions is favored by Mac OS, but it seems unlikely that your computer would have no paged memory allocated at all, so I'm guessing that it is probably adding 8GB of paged memory to your 8GB of actual RAM, for a total of 16GB of available (virtual) memory.
Remember that because the Operating System manages memory allocation and deallocation requests, it is free to do just about whatever it wants. My understanding is that most operating systems keep separate memory allocation tables for each process, so they can literally give the same virtual memory address to multiple programs, with those addresses mapping to different actual blocks of memory. A 64-bit operating system can therefore give each 32-bit program the maximum amount of 32-bit addresses; they're not all limited to the same 32-bit address space.
However, there are limits: the operating system can have limits set on the size that the page file is allowed to grow to. So unless you've deliberately told your Operating System to do so, it will probably not have 64 GB of total Virtual Memory. And even if it did, it can't allocate all 64 GB to every program, so you'd most likely have an OutOfMemory error before the OS allocates a virtual address at 0xFFFFFFFFF to your program. (In fact, I wouldn't be surprised to learn that 0xFFFFFFFFF is actually a reserved error-code location, similar to 0x0.) But since the addresses that your program knows about have no correlation to true memory addresses, there's a possibility that you'd end up being allocated a memory address that your program thinks of as 0xFFFFFFFFF, even if the operating system isn't using anywhere near that much memory.
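As a rough way to poke at that limit yourself, here is a hedged sketch: ask for far more memory than the machine has and see whether the OS grants the virtual address range at all. The outcome depends on the OS's overcommit and page-file policy, not just on physical RAM, and the 64 GB figure is just the number discussed above:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Assumes a 64-bit build, so that 64 GB fits in a size_t. */
    size_t huge = (size_t)64 * 1024 * 1024 * 1024;   /* 64 GB of virtual space   */
    char *p = malloc(huge);                          /* may or may not succeed   */

    if (p == NULL)
        printf("refused: the OS would not hand out 64 GB of virtual memory\n");
    else
        printf("granted a 64 GB range at %p (almost none of it is resident)\n",
               (void *)p);

    free(p);                                         /* free(NULL) is also fine  */
    return 0;
}
```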
Is there one big giant page table which includes all the pages for every process or each process has its own page table?
Likely both... and then some.
- Each process has its private memory table, and the OS will actively prevent your program from accessing a memory address that hasn't been allocated to this table.
- There's also such a thing as Shared Memory, so two processes which need to use the same information can create an area of shared memory and have addresses in that memory space be accessible by both (see the sketch after this list).
- The Operating System itself obviously needs to have some way to track how much overall memory is available, which address spaces are free/used, and which virtual memory blocks have been allocated to which locations in RAM or in the page file.
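Here is a hedged sketch of the shared-memory item above, using POSIX mmap with MAP_SHARED | MAP_ANONYMOUS (the page size and the message text are arbitrary): after fork, parent and child map the very same physical page, so the child's write is immediately visible to the parent.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* One page of anonymous memory, shared between parent and child. */
    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    if (fork() == 0) {                       /* child writes into the shared page */
        strcpy(shared, "hello from the child");
        return 0;
    }

    wait(NULL);                              /* parent reads the very same page   */
    printf("parent sees: %s\n", shared);
    munmap(shared, 4096);
    return 0;
}
```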
So, supposing a process has been allocated memory at address 0x00000002, when it goes to load the value out of that memory address, the operating system might recognize that this actually maps to the real memory address 0x00000F23, and that is the memory address whose value will actually be fetched into the CPU register. Or, it could realize that it has moved the page containing that address onto the disk somewhere, in which case the operating system will find an empty part of memory and load the page's data from the disk into that memory first. (Again, this memory address doesn't have any correlation with the original memory address that the program requested.)
If there isn't any empty memory to pull the page into, the OS will first have to move some other data out of memory and into the page file. It tries to intelligently determine which memory is least likely to be used in the near future. But sometimes memory keeps getting requested shortly after it has been swapped out to disk, only to evict, in turn, the very data a program is about to request. This "thrashing" is what causes computers with insufficient memory to go really, really slow, since disk accesses are orders of magnitude slower than memory accesses.