Does malloc lazily create the backing pages for an allocation on Linux (and other platforms)?

On Linux if I were to malloc(1024 * 1024 * 1024), what does malloc actually do?

I'm sure it assigns a virtual address to the allocation (by walking the free list and creating a new mapping if necessary), but does it actually create 1 GiB worth of swap pages? Or does it mprotect the address range and create the pages when you actually touch them like mmap does?

(I'm specifying Linux because the standard is silent on these kinds of details, but I'd be interested to know what other platforms do as well.)


Solution 1:

Linux does deferred page allocation, aka. 'optimistic memory allocation'. The memory you get back from malloc is not backed by anything and when you touch it you may actually get an OOM condition (if there is no swap space for the page you request), in which case a process is unceremoniously terminated.

See for example http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html

Solution 2:

9. Memory (part of The Linux kernel, Some remarks on the Linux Kernel by Andries Brouwer) is a good document.

It contains the following programs that demonstrate Linux's handling of physical memory versus actual memory and explains the kernel's internals.

Typically, the first demo program will get a very large amount of memory before malloc() returns NULL. The second demo program will get a much smaller amount of memory, now that earlier obtained memory is actually used. The third program will get the same large amount as the first program, and then it is killed when it wants to use its memory.

Demo program 1: allocate memory without using it.

#include <stdio.h>
#include <stdlib.h>

int main (void) {
    int n = 0;

    while (1) {
        if (malloc(1<<20) == NULL) {
                printf("malloc failure after %d MiB\n", n);
                return 0;
        }
        printf ("got %d MiB\n", ++n);
    }
}

Demo program 2: allocate memory and actually touch it all.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (void) {
    int n = 0;
    char *p;

    while (1) {
        if ((p = malloc(1<<20)) == NULL) {
                printf("malloc failure after %d MiB\n", n);
                return 0;
        }
        memset (p, 0, (1<<20));
        printf ("got %d MiB\n", ++n);
    }
}

Demo program 3: first allocate, and use later.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define N       10000

int main (void) {
    int i, n = 0;
    char *pp[N];

    for (n = 0; n < N; n++) {
        pp[n] = malloc(1<<20);
        if (pp[n] == NULL)
            break;
    }
    printf("malloc failure after %d MiB\n", n);

    for (i = 0; i < n; i++) {
        memset (pp[i], 0, (1<<20));
        printf("%d\n", i+1);
    }

    return 0;
}

(On a well-functioning system, like Solaris, the three demo programs obtain the same amount of memory and do not crash, but see malloc() return NULL.)