Prevent file descriptor inheritance during Linux fork

How do you prevent a file descriptor from being copy-inherited across fork() system calls (without closing it, of course)?

I am looking for a way to mark a single file descriptor as NOT to be (copy-)inherited by children at fork(): something like FD_CLOEXEC, but for forks (an FD_DONTINHERIT feature, if you like). Has anybody done this, or looked into it and has a hint to get me started?

Thank you

UPDATE:

I could use libc's __register_atfork

 __register_atfork(NULL, NULL, fdcleaner, NULL)

to close the FDs in the child just before fork() returns there. However, the FDs are still copied first, so this looks like a silly hack to me. The question is how to skip the dup()-ing of unneeded FDs into the child altogether.
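For illustration, here is a minimal sketch of that hack. It swaps in pthread_atfork(), the documented POSIX interface, for the glibc-internal __register_atfork(). The names mark_dontinherit, fd_open_in_child and MAX_PRIVATE_FDS are hypothetical, made up for this example. Note that it only closes the descriptors after the kernel has already copied them, which is exactly the cost I would like to avoid.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PRIVATE_FDS 16      /* hypothetical limit for this sketch */

static int private_fds[MAX_PRIVATE_FDS];
static int private_count;

/* Child-side atfork handler: runs in the child before fork() returns there. */
static void fdcleaner(void)
{
    for (int i = 0; i < private_count; i++)
        close(private_fds[i]);
    private_count = 0;
}

/* Remember fd as "not for children". Returns 0 on success, -1 if the
   table is full or the handler could not be registered. */
int mark_dontinherit(int fd)
{
    static int registered;

    if (private_count == MAX_PRIVATE_FDS)
        return -1;
    if (!registered) {
        if (pthread_atfork(NULL, NULL, fdcleaner) != 0)
            return -1;
        registered = 1;
    }
    private_fds[private_count++] = fd;
    return 0;
}

/* Fork and report whether fd is still open in the child:
   1 = open, 0 = closed, -1 = fork or wait failed. */
int fd_open_in_child(int fd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(fcntl(fd, F_GETFD) != -1);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After mark_dontinherit(fd), a call to fd_open_in_child(fd) should report the descriptor as closed in the child, while the parent's copy stays open.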

I'm thinking of some scenarios where an fcntl(fd, F_SETFD, FD_DONTINHERIT) would be needed (F_SETFD rather than F_SETFL, since this would be a per-descriptor flag like FD_CLOEXEC):

  • fork() will copy an event FD (e.g. an epoll descriptor); sometimes this isn't wanted. FreeBSD, for example, marks the kqueue() event FD as being of a KQUEUE_TYPE, and FDs of this type are not copied across forks (kqueue FDs are explicitly skipped; to use one from a child, you must fork with a shared FD table)

  • fork() will copy 100k unneeded FDs just to create a child for some CPU-intensive task (suppose the chance of needing fork() at all is very low, so the programmer won't want to maintain a pool of children for something that normally doesn't happen)

Some descriptors we want copied (0, 1, 2); some (most of them?) we do not. I think duplicating the full FD table is there for historical reasons, but I am probably wrong.

How silly does this sound:

  • patch fcntl() to support a dontinherit flag on file descriptors (not sure whether the flag should be kept per FD or in an FD table fd_set, the way the close-on-exec flags are kept)
  • modify dup_fd() in the kernel to skip copying dontinherit FDs, the same way FreeBSD does for kqueue FDs

Consider the program:

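#include <stdio.h>
#include <unistd.h>
#include <err.h>
#include <errno.h>
#include <stdlib.h>
#include <fcntl.h>
#include <time.h>
#include <sys/wait.h>   /* wait() */

#ifndef NUMFDS
#define NUMFDS 100      /* override with -DNUMFDS=... */
#endif

/* glibc-internal hook; it is not declared in any public header */
extern int __register_atfork(void (*prepare)(void), void (*parent)(void),
                             void (*child)(void), void *dso_handle);

static int fds[NUMFDS];
static clock_t t1;

/* close the first n descriptors in fds[] */
static void cleanup(int n)
{
    while (n-- > 0)
        close(fds[n]);
}

/* prepare handler: runs in the parent just before the fork */
static void clk_start(void)
{
    t1 = clock();
}

/* parent handler: runs in the parent just after the fork */
static void clk_end(void)
{
    double ticks = (double)(clock() - t1);
    double secs = ticks / CLOCKS_PER_SEC;

    printf("fork_cost(%d fds)=%fticks(%f seconds)\n", NUMFDS, ticks, secs);
}

int main(void)
{
    pid_t pid;
    int i;

    __register_atfork(clk_start, clk_end, NULL, NULL);
    for (i = 0; i < NUMFDS; i++) {
        fds[i] = open("/dev/null", O_RDONLY);
        if (fds[i] == -1) {
            int saved = errno;  /* close() may clobber errno */
            cleanup(i);
            errno = saved;
            err(EXIT_FAILURE, "open");
        }
    }
    pid = fork();               /* timed by clk_start/clk_end */
    if (pid < 0)
        err(EXIT_FAILURE, "fork");
    if (pid == 0) {
        cleanup(NUMFDS);
        exit(0);
    }
    wait(NULL);
    cleanup(NUMFDS);
    return 0;
}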

Of course, this can't be considered a real benchmark, but anyhow:

root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)

real    0m0.004s
user    0m0.000s
sys     0m0.000s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100000 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100000 fds)=10000.000000ticks(0.010000 seconds)

real    0m0.287s
user    0m0.010s
sys     0m0.240s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)

real    0m0.004s
user    0m0.000s
sys     0m0.000s

forkit ran on a Dell Inspiron 1520 Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz with 4GB RAM; average_load=0.00


If you fork with the purpose of calling an exec function, you can use fcntl with FD_CLOEXEC to have the file descriptor closed once you exec:

int fd = open(...);
fcntl(fd, F_SETFD, FD_CLOEXEC);

Such a file descriptor will survive a fork but not functions of the exec family.


No. Close them yourself, since you know which ones need to be closed.
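A rough sketch of what that could look like, assuming you keep the unwanted descriptors in an array. The names close_unneeded and fork_without are hypothetical, and the self-check in the child is only there to show the effect:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Close every descriptor in fds[0..n-1]; meant to be called in the
   child right after fork(). */
void close_unneeded(const int *fds, size_t n)
{
    for (size_t i = 0; i < n; i++)
        close(fds[i]);
}

/* Fork a child that drops the listed descriptors before doing its work.
   Returns how many of them are still open in the child (0 is expected),
   or -1 if fork or wait failed. */
int fork_without(const int *fds, size_t n)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int still_open = 0;

        close_unneeded(fds, n);
        /* ... the child's real work would go here ... */
        for (size_t i = 0; i < n; i++)
            if (fcntl(fds[i], F_GETFD) != -1)
                still_open++;
        _exit(still_open);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent's descriptors are untouched; only the child's copies are closed.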