snmpd updates interface counters slowly (or something like this)

Solution 1:

This is usually related to the SNMP response not being received in a timely manner.
Because SNMP uses UDP, that could mean network congestion or host congestion caused the request or reply to be lost, but more commonly one of the two machines involved simply couldn't get around to dealing with the request in time and the other machine got sick of waiting.

The chance of one machine or the other falling behind increases with workload. If you have a lot of monitoring systems polling a particular host, its SNMP daemon may not answer as quickly as some of those pollers expect (and they will show blank spots in the graphs, or report other errors).
Conversely, if you have one poller querying a bunch of hosts, more than it can handle in your polling interval, the machines that don't get queried during the interval will have a gap in their graphs. (This problem was particularly common with Cacti's PHP poller, and led to the development of cactid (now spine), which I strongly encourage you to use if you're not already using it.)
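To make the "more than it can handle in your polling interval" point concrete, here is a rough back-of-the-envelope sketch. The function names, the host count, and the per-host time are all made up for illustration, and the worker model is a simplification (it assumes every host takes the same time and workers never wait on each other).

```python
import math


def cycle_seconds(host_count, seconds_per_host, workers=1):
    """Estimate how long one full polling pass takes.

    Simplifying assumption: every host takes the same time and the
    workers run fully in parallel.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Each worker handles one host at a time, so hosts are processed in batches.
    batches = math.ceil(host_count / workers)
    return batches * seconds_per_host


def fits_interval(host_count, seconds_per_host, interval_seconds, workers=1):
    """True if a full pass finishes before the next poll is due."""
    return cycle_seconds(host_count, seconds_per_host, workers) <= interval_seconds


# Hypothetical numbers: 400 hosts at 1.5 s each against a 300 s interval.
serial_ok = fits_interval(400, 1.5, 300)               # one serial poller
parallel_ok = fits_interval(400, 1.5, 300, workers=4)  # four parallel workers
```

With these invented numbers the serial pass needs 600 seconds and overruns the 300 second interval, while four workers finish in 150 seconds. That is roughly the gap a parallel poller like spine is meant to close.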


My general advice on fixing this:

  1. Poll every 5 minutes, if possible.
    Most environments don't need 1/5/15/30/60/90/120 second polling intervals.
    If five-minute granularity is good enough for you, stick with it. It's less work for your servers, less work for your monitoring systems, and less data to store (or a longer period of time at "full granularity").

  2. Increase the SNMP timeout on your pollers.
    Give the server more time to get around to your request. SNMP daemons are the lazy teenager of processes - you ask them to clean their room (or give you a tree's worth of data) on Monday, and on Wednesday or Thursday they might have picked up a few socks.

  3. Limit how much you're demanding from the server with each poll.
    If you just need one counter don't ask for the whole interfaces MIB -- it (usually) takes a longer time to walk the tree and generate full output than it does to just give you one OID.

  4. Limit how many pollers are asking for data.
    If you can consolidate your monitoring to one box (Zabbix or Cacti) you'll be putting fewer demands on your server, and it's more likely to respond in time.
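As a rough illustration of points 2 and 3, here is a sketch that builds a Net-SNMP `snmpget` command asking for a single counter with a longer timeout. The `-v`, `-c`, `-t` and `-r` options are standard Net-SNMP flags; the host address, the `public` community string, the interface index `1`, and the function names are placeholders I made up. The worst-case wait assumes the timeout is the same for every attempt.

```python
import subprocess


def build_snmpget(host, oid, community="public", timeout_s=5, retries=2):
    """Build an snmpget command for ONE OID instead of walking a whole tree.

    community, timeout_s and retries are illustrative values, not recommendations.
    """
    return [
        "snmpget",
        "-v", "2c",             # SNMP version 2c
        "-c", community,        # community string (placeholder)
        "-t", str(timeout_s),   # seconds to wait per attempt
        "-r", str(retries),     # retries after the first attempt
        host,
        oid,
    ]


def worst_case_wait(timeout_s, retries):
    """Longest time spent on a dead host: first attempt plus every retry."""
    return timeout_s * (retries + 1)


def fetch_counter(host, oid, **options):
    """Run snmpget and return its raw output (requires Net-SNMP installed)."""
    command = build_snmpget(host, oid, **options)
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout.strip()


# Hypothetical target: one interface counter on a documentation-range address.
command = build_snmpget("192.0.2.10", "IF-MIB::ifHCInOctets.1")
```

Keep in mind that a longer timeout also means a longer stall on hosts that are really down (15 seconds per dead host with the values above), so it trades off against how many hosts you can cover in one polling interval.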

If you're still having trouble after trying the above, there is the ultimate debugging step: hunt through your logs and sniff the SNMP traffic. Make sure requests and responses are going back and forth in a timely manner and not being lost or rejected as malformed for some reason. Often looking at the data on the wire will give you a good indication of what's wrong and how to fix it.

Solution 2:

Which version of the SNMP protocol do you use? SNMP v1 does not support 64-bit counters, so on fast interfaces the 32-bit counters can wrap between polls. It's an old issue with Cacti; just switch to "Version 2" on the relevant "Device".
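To see why the counter width matters, here is a small arithmetic sketch. It assumes an octet (byte) counter on a link running at full line rate, which is the worst case; the function names are mine.

```python
def seconds_until_wrap(link_bits_per_second, counter_bits=32):
    """Time for an octet counter to roll over at full line rate."""
    counter_capacity_bytes = 2 ** counter_bits
    bytes_per_second = link_bits_per_second / 8
    return counter_capacity_bytes / bytes_per_second


def counter_delta(previous, current, counter_bits=32):
    """Difference between two samples, allowing for at most ONE wrap.

    If the counter wrapped more than once between polls the result is
    wrong, and nothing in the two samples can reveal that.
    """
    if current >= previous:
        return current - previous
    return current + 2 ** counter_bits - previous


gigabit_wrap = seconds_until_wrap(1_000_000_000)     # 1 Gbit/s link
fast_ethernet_wrap = seconds_until_wrap(100_000_000)  # 100 Mbit/s link
```

A 32-bit octet counter wraps in about 34 seconds on a saturated 1 Gbit/s link and in a bit under 6 minutes at 100 Mbit/s, so a five-minute poll can easily miss one or more wraps. With `counter_bits=64` the same calculation comes out to thousands of years, which is why the 64-bit counters available from SNMP v2c onward avoid the problem.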