Linux NFLOG - documentation, configuration from C

Several different places (e.g. http://wiki.wireshark.org/CaptureSetup/NFLOG) recommend using Linux's "NFLOG" firewall module to capture packets generated by a particular UID, like this:

# iptables -A OUTPUT -m owner --uid-owner 1000 -j CONNMARK --set-mark 1
# iptables -A INPUT -m connmark --mark 1 -j NFLOG --nflog-group 30 
# iptables -A OUTPUT -m connmark --mark 1 -j NFLOG --nflog-group 30 
# dumpcap -i nflog:30 -w uid-1000.pcap

I have not been able to find any documentation for how this works exactly (in particular, netfilter.org has a whole lot of poorly-written library API documentation and, as far as I can tell, nothing whatsoever on the semantics of the actual kernel-level firewall rules), so I have several questions:

  1. Is there any damn documentation and where is it hiding?

  2. Is the CONNMARK thing actually necessary? That is, would this work just as well?

    # iptables -A INPUT -m owner --uid-owner 1000 -j NFLOG --nflog-group 30 
    # iptables -A OUTPUT -m owner --uid-owner 1000 -j NFLOG --nflog-group 30
    
  3. Is it necessary to have "ulogd" running for this to work?

  4. Is there a way to tell the kernel to pick an unallocated group number for me and tell me what it is?

  5. Is there a way to tell the kernel that these filter rules should be automatically deleted when process X terminates? (Process X would not be running as uid 1000.)

  6. Presumably the iptables command makes some special ioctl calls or something to configure the firewall. Is there a C library that can be used to do the same from within a program (namely, "process X" from Q5)?


Is there any damn documentation and where is it hiding?

The netfilter site has examples that help explain the functionality: http://www.netfilter.org/projects/libnetfilter_log/doxygen/files.html

Here is a function from my own code that sets up an NFLOG listener with libnetfilter_log.

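#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sysexits.h>
#include <systemd/sd-journal.h>
#include <libnetfilter_log/libnetfilter_log.h>

/* BUFSZ, queue_t and queue_push() are defined elsewhere in my code. */
void setup_netlogger_loop(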
    int groupnum,
    queue_t queue)
{
  int sz;
  int fd = -1;
  char buf[BUFSZ];
  /* Setup handle */
  struct nflog_handle *handle = NULL;
  struct nflog_g_handle *group = NULL;

  memset(buf, 0, sizeof(buf));

  /* This opens the relevant netlink socket of the relevant type */
  if ((handle = nflog_open()) == NULL){
    sd_journal_perror("Could not get netlink handle");
    exit(EX_OSERR);
  }

  /* Tell the kernel we want IPv4 (AF_INET) packets, not IPv6 */
  if (nflog_bind_pf(handle, AF_INET) < 0) {
    sd_journal_perror("Could not bind netlink handle");
    exit(EX_OSERR);
  }

  /* Setup groups, this binds to the group specified */
  if ((group = nflog_bind_group(handle, groupnum)) == NULL) {
    sd_journal_perror("Could not bind to group");
    exit(EX_OSERR);
  }
  if (nflog_set_mode(group, NFULNL_COPY_PACKET, 0xffff) < 0) {
    sd_journal_perror("Could not set group mode");
    exit(EX_OSERR);
  }
  if (nflog_set_nlbufsiz(group, BUFSZ) < 0) {
    sd_journal_perror("Could not set group buffer size");
    exit(EX_OSERR);
  }
  if (nflog_set_timeout(group, 1500) < 0) {
    sd_journal_perror("Could not set the group timeout");
  }

  /* Register the callback */
  nflog_callback_register(group, &queue_push, (void *)queue);

  /* Get the actual FD for the netlogger entry */
  fd = nflog_fd(handle);

  /* We continually read from the socket and push the contents into
     nflog_handle_packet (which separates one entry from the other),
     which will eventually invoke our callback (queue_push) */
  for (;;) {
    sz = recv(fd, buf, BUFSZ, 0);
    if (sz < 0 && errno == EINTR)
      continue;
    else if (sz < 0)
      break;

    nflog_handle_packet(handle, buf, sz);
  }
}

Is the CONNMARK thing actually necessary? That is, would this work just as well?

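For outgoing packets it is unnecessary: the owner match can feed NFLOG directly in the OUTPUT chain. For incoming packets it is needed. The owner match only applies to locally generated packets and is only valid in the OUTPUT and POSTROUTING chains (see the iptables-extensions man page), so the INPUT rule in your alternative will not work. Marking the connection on output and matching the connmark on input is what lets you capture the replies as well.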

Is it necessary to have "ulogd" running for this to work?

No. In fact I don't use it in this application.

Is there a way to tell the kernel to pick an unallocated group number for me and tell me what it is?

Not that I am aware of. In any case it would not help much. Suppose you have one NFLOG rule for HTTP traffic, one that logs dropped FTP packets and one that scans for SMTP strings. If the kernel picked the group numbers, you could not tell which rule is bound to which group, and so which group you should listen on.

Is there a way to tell the kernel that these filter rules should be automatically deleted when process X terminates? (Process X would not be running as uid 1000.)

No. However, the kernel only fills a buffer up to a maximum size and then discards data, so rules that nobody is listening to will not use up memory and should not cause a performance problem.
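
Since the kernel will not clean up for you, process X can do it itself. The following is only a minimal sketch under a few assumptions: process X is allowed to run iptables, it installs just the OUTPUT rule from the question, and it exits normally. The function names (build_rule_argv, run_rule, install_rule_until_exit) are made up for this illustration and are not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RULE_ARGV_LEN 12        /* 11 words plus the NULL terminator */
#define NUMBUF_LEN    16        /* enough for any 32-bit decimal number */

/* Build: iptables <action> OUTPUT -m owner --uid-owner <uid>
 *                 -j NFLOG --nflog-group <group>
 * action must be "-A" (append) or "-D" (delete the same rule again).
 * Returns 0 on success, -1 if action is anything else. */
int build_rule_argv(const char *action, unsigned uid, unsigned group,
                    char uidbuf[NUMBUF_LEN], char grpbuf[NUMBUF_LEN],
                    const char *argv[RULE_ARGV_LEN])
{
    assert(action != NULL && argv != NULL);
    if (strcmp(action, "-A") != 0 && strcmp(action, "-D") != 0)
        return -1;
    snprintf(uidbuf, NUMBUF_LEN, "%u", uid);
    snprintf(grpbuf, NUMBUF_LEN, "%u", group);
    const char *words[RULE_ARGV_LEN] = {
        "iptables", action, "OUTPUT", "-m", "owner", "--uid-owner", uidbuf,
        "-j", "NFLOG", "--nflog-group", grpbuf, NULL
    };
    memcpy(argv, words, sizeof(words));
    return 0;
}

/* Run iptables with that argv and wait for it; 0 only if it exited with 0. */
int run_rule(const char *action, unsigned uid, unsigned group)
{
    char uidbuf[NUMBUF_LEN], grpbuf[NUMBUF_LEN];
    const char *argv[RULE_ARGV_LEN];
    int status;
    pid_t pid;

    if (build_rule_argv(action, uid, group, uidbuf, grpbuf, argv) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], (char *const *)argv);
        _exit(127);             /* only reached if exec failed */
    }
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

static unsigned cleanup_uid, cleanup_group;

/* atexit handler: delete the rule that install_rule_until_exit() added. */
static void remove_rule_at_exit(void)
{
    (void)run_rule("-D", cleanup_uid, cleanup_group);
}

/* Append the rule now and arrange for it to be deleted at normal exit. */
int install_rule_until_exit(unsigned uid, unsigned group)
{
    if (run_rule("-A", uid, group) != 0)
        return -1;
    cleanup_uid = uid;
    cleanup_group = group;
    return atexit(remove_rule_at_exit) == 0 ? 0 : -1;
}
```

Process X would call install_rule_until_exit(1000, 30) before it starts reading from the group. Note that atexit handlers only run on a normal exit: if the process is killed by SIGKILL or crashes, the rule stays behind, and for other signals you would need a handler that ends up calling exit().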

Presumably the iptables command makes some special ioctl calls or something to configure the firewall. Is there a C library that can be used to do the same from within a program (namely, "process X" from Q5)?

I am not aware of any netfilter library with a supported public API for manipulating the rules. The iptables tool uses an internal library (libiptc) instead, which is not intended for use by other programs.

iptables inherits a rather archaic method of talking to the kernel: you open a SOCK_RAW IP socket and pass the rule tables through getsockopt() and setsockopt() calls on it. This is going away with nftables (it makes no sense), which speaks netlink to do the same thing.
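
To give a rough idea of what that raw-socket interface looks like, here is a small sketch that asks the kernel for basic information about one table. It assumes the kernel UAPI header linux/netfilter_ipv4/ip_tables.h is installed and that the caller is root (or has the needed capabilities). The function name query_table_info is made up for this illustration. Reading and replacing the actual rules uses further socket options on the same socket and is considerably more involved.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/netfilter_ipv4/ip_tables.h>

/* Ask the kernel for basic facts about an iptables table such as "filter".
 * Returns 0 on success, or -1 with errno set (for example EPERM when the
 * caller lacks the privileges to open a raw socket). */
int query_table_info(const char *table, struct ipt_getinfo *info)
{
    socklen_t len = sizeof(*info);
    int fd, rc, saved_errno;

    /* The raw IP socket is only a handle for the socket options below;
     * no packets are sent or received on it. */
    fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
    if (fd < 0)
        return -1;

    memset(info, 0, sizeof(*info));
    strncpy(info->name, table, sizeof(info->name) - 1);

    /* On success the kernel fills in valid_hooks, num_entries, size, ... */
    rc = getsockopt(fd, IPPROTO_IP, IPT_SO_GET_INFO, info, &len);

    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

On success, info->num_entries and info->size describe the table's rule blob, which is what a program would need before fetching the entries themselves. In practice, running the iptables binary from your program is much less fragile than reimplementing this.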