What is virtual memory?

I was double-checking my notes for 'Virtual Memory' and the definition in my textbook is:

Process of allocating a section of secondary storage to act as part of the main memory

Whereas Wikipedia says:

Virtual memory is a computer system technique which gives an application program the impression that it has contiguous working memory (an address space)

and (Wikipedia also says)

Note that "virtual memory" is more than just "using disk space to extend physical memory size"

Can anyone offer any clarification as to which is correct?


Solution 1:

Note that "virtual memory" is more than just "using disk space to extend physical memory size"

Virtual memory is a layer of abstraction provided to each process. The computer has, say, 2GB of physical RAM, addressed from 0 to 2G. A process might see an address space of 4GB, which it has entirely to itself. The mapping from virtual addresses to physical addresses is handled by a memory management unit (MMU), which is managed by the operating system. Typically this is done in 4KB "pages".
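To make the mapping concrete, here's a toy C sketch of the address arithmetic (the address is made up, and a real MMU does this in hardware through page tables rather than in code): with 4KB pages, the low 12 bits of a virtual address are the offset within the page, and the remaining bits select the page.

```c
/* Toy illustration of splitting a 32-bit virtual address into a page
 * number and an offset, assuming 4KB pages. The page number is what
 * the page table translates to a physical frame; the offset passes
 * through unchanged. */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE  4096u   /* 4KB pages, as in the text */
#define PAGE_SHIFT 12      /* log2(PAGE_SIZE) */

int main(void) {
    uint32_t vaddr = 0x12345678;              /* arbitrary example address */
    uint32_t page  = vaddr >> PAGE_SHIFT;     /* page number: 0x12345 */
    uint32_t off   = vaddr & (PAGE_SIZE - 1); /* offset in page: 0x678 */

    printf("virtual 0x%08x -> page 0x%x, offset 0x%03x\n", vaddr, page, off);
    return 0;
}
```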

This gives several features:

  1. A process cannot see memory in other processes (unless the OS wants it to!)
  2. Memory at a given virtual address may not be located at the same physical address
  3. Memory at a virtual address can be "paged out" to disk, and then "paged in" when it is accessed again.

Your textbook defines virtual memory (incorrectly) as just #3. The sketch below demonstrates #1 and #2 in action.
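Here is a minimal POSIX C sketch (assuming fork() is available): parent and child print the same virtual address for x, yet each ends up with its own value, because that one virtual address maps to different physical memory in each process.

```c
/* Parent and child see the same virtual address for x, but after the
 * child writes to it, each process has its own value: the shared
 * virtual address is backed by different physical pages (modern
 * kernels split them lazily, via copy-on-write). */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int x = 1;

int main(void) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {                 /* child */
        x = 42;
        printf("child:  &x = %p, x = %d\n", (void *)&x, x);
    } else {                        /* parent */
        wait(NULL);
        printf("parent: &x = %p, x = %d\n", (void *)&x, x);
    }
    return 0;
}
```

Both lines print the identical &x, yet the parent still sees x = 1 while the child sees 42.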

Even without any swapping, you particularly need to be aware of virtual memory if you write a driver for a device that does DMA (direct memory access). Your driver code runs on the CPU, so its memory accesses go through the MMU (they are virtual). The device typically does not go through the MMU, so it sees raw physical addresses. As a driver writer you therefore need to ensure (a sketch follows this list):

  1. Any raw memory addresses you pass to the hardware are physical, not virtual.
  2. Any large (multi-page) blocks of memory you send are physically contiguous. An 8K array might be virtually contiguous (through the MMU) yet consist of two physically separate 4K pages. If you tell the device to write 8K of data to the physical address corresponding to the start of that array, it will write the first 4K where you expect, but the second 4K will corrupt some memory somewhere. :-(
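For a feel of how real driver code avoids both pitfalls, here's a hedged sketch of the usual Linux kernel approach: dma_alloc_coherent() hands back both a kernel virtual address for the CPU and a physically contiguous DMA address for the device. (The names example_alloc/example_free, and the assumption that dev is your driver's struct device, are placeholders for illustration.)

```c
/* Sketch of a Linux driver allocating a DMA-safe buffer.
 * dma_alloc_coherent() guarantees the buffer is physically contiguous
 * and returns two views of it: cpu_addr (virtual, for the CPU/MMU) and
 * dma_handle (the address the hardware should be programmed with). */
#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>

#define BUF_SIZE 8192   /* the 8K buffer from the example above */

static void *cpu_addr;        /* use this from driver code */
static dma_addr_t dma_handle; /* give this to the hardware */

static int example_alloc(struct device *dev)
{
    cpu_addr = dma_alloc_coherent(dev, BUF_SIZE, &dma_handle, GFP_KERNEL);
    if (!cpu_addr)
        return -ENOMEM;

    /* ...program the device with dma_handle, never with cpu_addr... */
    return 0;
}

static void example_free(struct device *dev)
{
    dma_free_coherent(dev, BUF_SIZE, cpu_addr, dma_handle);
}
```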

Solution 2:

I'll try to start slowly, and then put this all together for you. It's like this:

Virtual memory, as commonly used, refers to "paging". As the name suggests, paging is like a human notepad.

When you're working out simple sums, or learning simple information, you do it all in your head: you just load up all the information, process it, and get the answer. This is like a computer loading files from the hard drive -- it loads up the programs or pictures or other information it needs to work into its "real memory" (or "physical memory") and works on them with its "brain" (its processor).

However, when you're learning complex information, or working with complex sums, you might not be able to fit all that in your head at once. You get confused, start to slow down, fail to keep it all in there at once, and have to forget something to remember something else.

The human solution is to use a notepad. We note down on pages all the things we can't remember at once, but refer to them while doing the sums. We might not be able to remember a huge list of sales figures for the month, but we can look at the pages, get the information a bit at a time, and process each bit. This is like the computer "paging" its memory -- writing pages full of information out to disk for later reference, then realising it needs a page and loading that page back from disk into real memory. On Windows, the place where these pages are stored is literally called the "pagefile"; on Linux and Unix it's a swap file or swap partition. Either way, the chunks of data in memory are literally called "pages". Different systems have different names for these things, but the general concept is much the same.

So really, paging is very simple. Not all of the pages of information fit in memory at once, so some pages are put on disk and loaded again later.

Now, where it gets more complicated is that modern systems also feature memory mapping and memory protection, which are usually handled by the same piece of hardware in the computer: the memory management unit, or MMU.

In a (modern) multitasking computer, which can run many programs at once and features memory protection, each program is usually separated from the other programs running on the same system. This way, one program cannot alter another program simply by accessing its memory -- the MMU physically separates one program's address space from the others'. In other words, one user's programs don't see another user's programs, or even other programs belonging to the same user. They don't see "real memory" -- they see their own "virtual memory".
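You can poke at that MMU enforcement from within a single process, too. The sketch below (plain POSIX C, using the real mmap() and mprotect() calls) revokes a page's permissions and shows that the very next access is blocked by the hardware and delivered to the program as a SIGSEGV fault:

```c
/* Map one writable page, revoke all access with mprotect(), and show
 * that the next access faults: the MMU refuses it and the kernel
 * delivers SIGSEGV to the process. */
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

static void on_segv(int sig) {
    (void)sig;
    /* write() is async-signal-safe, unlike printf() */
    write(STDOUT_FILENO, "MMU blocked the access\n", 23);
    _exit(0);
}

int main(void) {
    signal(SIGSEGV, on_segv);

    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 1;

    p[0] = 'A';                   /* fine: page is readable/writable */
    mprotect(p, 4096, PROT_NONE); /* revoke all permissions          */
    p[0] = 'B';                   /* faults: handler runs and exits  */

    return 1;                     /* never reached */
}
```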

Now, this memory isolation concept and the paging-to-disk concept are two conceptually different things, which is probably why you're confused. However, the key is that they both work through the MMU -- the memory management unit, which splits memory into pages and maps pages into each program's virtual address space.

So, when a program asks for the memory at a certain "memory address", what really happens is that the program's page mappings (its "address space") are looked up, and the page that corresponds to that address is found. That page may be loaded somewhere in real memory, in which case the program is simply given access, or it may have been paged out to disk. If it has been paged out, the access triggers a "page fault" -- the disk is accessed, and the page gets loaded back into memory. So the program works even when there isn't enough memory, but it runs SLOWLY if it has to use the disk for what would normally be a very fast memory access.
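You can actually watch this on-demand behaviour from user space. This sketch (Linux-flavoured, using the real mincore() call; whether untouched pages count as resident can vary by kernel) maps four anonymous pages and shows that only the pages you touch become resident in real memory:

```c
/* Map four pages but touch only pages 0 and 2; mincore() reports which
 * pages are resident in real memory before and after the touches. */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long psize = sysconf(_SC_PAGESIZE);
    size_t len = 4 * (size_t)psize;
    unsigned char vec[4];

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 1;

    mincore(p, len, vec); /* which of the 4 pages are in RAM? */
    printf("before touch: %d %d %d %d\n",
           vec[0] & 1, vec[1] & 1, vec[2] & 1, vec[3] & 1);

    p[0] = 1;          /* touch page 0 */
    p[2 * psize] = 1;  /* touch page 2 */

    mincore(p, len, vec);
    printf("after touch:  %d %d %d %d\n",
           vec[0] & 1, vec[1] & 1, vec[2] & 1, vec[3] & 1);

    munmap(p, len);
    return 0;
}
```

Typical output is `0 0 0 0` before and `1 0 1 0` after: the mapping exists the whole time, but real memory is only committed page by page, on first access.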

Now, if there isn't enough space to load that page into memory, then you have a problem. In that case, some OTHER page that's already in memory has to be "swapped" out to disk so the first page can be loaded -- and the evicted page might equally belong to the same program. You see this sometimes in graphics programs, for instance, on heavily loaded systems: part of the picture is loaded slowly and drawn quickly, then the next part is loaded equally slowly and drawn quickly, and when you go back to work with the first part, it's slow AGAIN. That's because the pages are being loaded in to be worked on, then swapped out again so something else can be worked on. Obviously, this is a very slow way to work, and what you really need is more REAL memory.