FreeBSD's ng_nat periodically stops passing packets
I have a FreeBSD router:
#uname
9.1-STABLE FreeBSD 9.1-STABLE #0: Fri Jan 18 16:20:47 YEKT 2013
It's a powerful computer with plenty of memory:
#top -S
last pid: 45076; load averages: 1.54, 1.46, 1.29 up 0+21:13:28 19:23:46
84 processes: 2 running, 81 sleeping, 1 waiting
CPU: 3.1% user, 0.0% nice, 32.1% system, 5.3% interrupt, 59.5% idle
Mem: 390M Active, 1441M Inact, 785M Wired, 799M Buf, 5008M Free
Swap: 8192M Total, 8192M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 4 155 ki31 0K 64K RUN 3 71.4H 254.83% idle
13 root 4 -16 - 0K 64K sleep 0 101:52 103.03% ng_queue
0 root 14 -92 0 0K 224K - 2 229:44 16.55% kernel
12 root 17 -84 - 0K 272K WAIT 0 213:32 15.67% intr
40228 root 1 22 0 51060K 25084K select 0 20:27 1.66% snmpd
15052 root 1 52 0 104M 22204K select 2 4:36 0.98% mpd5
19 root 1 16 - 0K 16K syncer 1 0:48 0.20% syncer
Its tasks are NAT via ng_nat and a PPPoE server via mpd5.
Traffic through it is about 300 Mbit/s, around 40 kpps at peak. There are up to 350 PPPoE sessions.
ng_nat is configured by the following script:
/usr/sbin/ngctl -f- <<-EOF
mkpeer ipfw: nat %s out
name ipfw:%s %s
connect ipfw: %s: %s in
msg %s: setaliasaddr 1.1.%s
EOF
There are 20 such ng_nat nodes, with about 150 clients.
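For clarity, this is roughly what a single node looks like with the placeholders filled in. The hook numbers (60/61), the node name (nat60) and the alias address (1.1.1.1) below are illustrative examples, not my real values; the matching ipfw rules (not shown) push traffic into these hooks with the netgraph action.
# illustrative expansion for one NAT node
/usr/sbin/ngctl -f- <<-EOF
mkpeer ipfw: nat 60 out
name ipfw:60 nat60
connect ipfw: nat60: 61 in
msg nat60: setaliasaddr 1.1.1.1
EOF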
Sometimes the traffic through the NAT stops. When this happens, vmstat reports a lot of FAILs:
vmstat -z | grep -i netgraph
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
NetGraph items: 72, 10266, 1, 376,39178965, 0, 0
NetGraph data items: 72, 10266, 9, 10257,2327948820,2131611,4033
I tried increasing
net.graph.maxdata=10240
net.graph.maxalloc=10240
but this didn't help.
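As far as I understand, these are boot-time tunables, so they have to go into /boot/loader.conf and only take effect after a reboot. This is the kind of entry I mean (the values are just the ones from above):
# /boot/loader.conf -- netgraph item/data limits, only read at boot
net.graph.maxdata=10240
net.graph.maxalloc=10240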
This is a new problem (the last 1-2 weeks). The configuration had been working well for about 5 months, and no configuration changes were made before the problem started.
In the last few weeks traffic has increased slightly (from 270 to 300 Mbit/s) and there are a few more PPPoE sessions (300 -> 350).
Please help me figure out how to track down and fix this problem.
UPD: Info about the network cards:
# pciconf -lv | grep -B3 network
em0@pci0:0:25:0: class=0x020000 card=0x35788086 chip=0x15028086 rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
device = '82579LM Gigabit Network Connection'
class = network
--
em1@pci0:2:0:0: class=0x020000 card=0x35788086 chip=0x10d38086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = '82574L Gigabit Network Connection'
class = network
UPD: There are two "top" outputs at https://gist.github.com/korjavin/9190181
from when I switched net.isr.dispatch to hybrid. After this I had tons of mpd processes (I don't know why) and one CPU at 100% interrupt, and after 10 minutes the box had to be rebooted due to heavy packet loss.
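For reference, the only thing I changed for that experiment was the dispatch policy, roughly like this (set at runtime with sysctl; it could also be put in /etc/sysctl.conf):
# switch the netisr dispatch policy (the change described above)
sysctl net.isr.dispatch=hybrid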
UPD: It happened again. There is "top" output from before and after the reboot at https://gist.github.com/korjavin/9254734
It looks like the problem is the ng_queue process, which eats too much CPU. Since my first post there are many more sessions and much more traffic: about 400 PPPoE sessions and 450 Mbit/s.
I'd try bumping net.link.ifqmaxlen in /boot/loader.conf to 10240 (sketch below). As I understand it, the em(4) driver (and igb(4), the other Intel driver), or at least your 82574L, won't balance non-IP traffic (your PPPoE), so everything goes into a single ng_queue.
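Concretely, something like this in /boot/loader.conf (it's a boot-time tunable, so a reboot is needed; 10240 is just the value I'd start with):
# /boot/loader.conf
net.link.ifqmaxlen=10240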
I don't understand why one of your interfaces (em0) is using a single IRQ while the other (em1) uses separate IRQs for tx, rx, and link. Are both NICs in MSI-X-capable slots?
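You can see what each card advertises and what it actually got with something like this (pciconf -lc lists the PCI capabilities, vmstat -i lists the allocated interrupt vectors; the grep patterns are just illustrative):
# look for MSI/MSI-X capability entries under em0 and em1
pciconf -lc | grep -A6 '^em'
# see how many interrupt vectors each NIC was actually assigned
vmstat -i | grep em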
You can probably make more sense of this than I can (I don't know Russian, and Google translate doesn't help much):
http://forum.nag.ru/forum/index.php?s=c4da62052515736f45c73b932216af33&showtopic=82322&st=0
This thread from the FreeBSD forums has some suggestions.
The FreeBSD wiki page on Network Performance Tuning explains a little bit about single-threading in ng_nat and some workarounds.
Some people have reported success disabling IPv6 in the kernel (and in mpd) but I don't see any real consensus there.
EDIT: I forgot to add this one; it seems to have several other relevant tuning parameters. I thought the dummynet-related ones looked promising.
Let me know what happens; this is an interesting problem...