Can remote logging with syslog-ng hang my application?

Solution 1:

No. For one, the handoff is asynchronous within the local operating system. Syslog libraries and the local syslog daemon will either accept the message (even if they later fail to deliver it) or fail fast, but either way your app won't hang. Secondly, the network protocol is UDP by default, so even if your application blocked until the packet was sent, the packet would go out immediately and control would return to your app regardless of whether it ever reaches the collecting host.
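To illustrate the UDP point only, here is a minimal Python sketch that sends one syslog-style datagram and times the call. The host, port 5514 and the function name are assumptions made up for the example, and nothing needs to be listening on that port. It says nothing about the local handoff to the daemon, which is a separate step.

```python
import socket
import time


def send_udp_syslog(message, host="127.0.0.1", port=5514):
    """Send one syslog-style datagram over UDP and return the elapsed seconds.

    host and port are placeholders for this sketch, not a real collector.
    """
    # PRI 14 = facility "user" (1) * 8 + severity "info" (6).
    payload = f"<14>{message}".encode("utf-8")
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # UDP has no handshake and no acknowledgement, so sendto() returns
        # once the datagram is handed to the kernel, listener or not.
        sock.sendto(payload, (host, port))
    return time.monotonic() - start


elapsed = send_udp_syslog("app started")
print(f"sendto returned after {elapsed:.6f}s")
```

The flip side is that the sender never learns whether the message arrived.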

When people think of remote logging hanging things in *nix land, it's usually because they were logging to an NFS mount, which most certainly can cause hangs. With syslog, you're good.

Solution 2:

This can indeed happen - there are a number of situations where that kind of lock up can happen, and they all basically boil down to the syslog queue or buffer being full, so that writes are delayed.

That (generally) tends to compound the problem, because things start failing and want to log as much, but then have to wait for syslog to accept their messages.
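The "buffer full, so writes are delayed" mechanism can be reproduced in miniature. This Python sketch is an analogy, not how any particular syslog daemon is built: it writes into a socket pair whose other end never reads, standing in for a stalled daemon. The function name, chunk size and limit are assumptions for the demo. The writer is set to non-blocking so that the script reports the full buffer instead of hanging; a blocking writer would simply stop at that point.

```python
import socket


def fill_until_full(chunk=b"x" * 1024, limit=100_000):
    """Write to a socket nobody reads from.

    Returns (number of accepted send() calls, whether the buffer filled up).
    """
    # The reader end is never read, like a daemon that has stopped draining.
    writer, reader = socket.socketpair()
    # Non-blocking, so a full buffer raises instead of stalling this script.
    writer.setblocking(False)
    sent = 0
    try:
        while sent < limit:
            try:
                writer.send(chunk)
            except BlockingIOError:
                # The kernel buffer is full: a blocking writer would hang here.
                return sent, True
            sent += 1
        return sent, False
    finally:
        writer.close()
        reader.close()


sent, full = fill_until_full()
print(f"accepted {sent} writes, buffer full: {full}")
```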

Do note that there are also bugs that can cause improper behaviour in such situations - notably, rsyslog caused this issue on RH (https://bugzilla.redhat.com/show_bug.cgi?id=519203). So I would definitely recommend checking your software versions against known bugs.

Also, check your syslog's DNS settings - for clients pushing out syslog, there is no reason I can think of to use DNS. For the receiving server, if you can do without DNS lookups, that might be worth trying to see if it helps throughput.

Fortunately, there are a number of fixes (not specific to syslog-ng), but the short version is that you will need to make some kind of compromise.

  1. If you can tolerate the loss of some data, switching your logging to UDP is an option. Obviously, given the kind of issue you are describing, it seems almost certain that if you do this, you will lose some data.

  2. Another option is being more selective about what you send across the network - i.e. filter, and/or prioritise, some flows over others. How much this helps depends in part on what options are available in your chosen syslog implementation - rsyslog has quite a lot of options, others I am not so familiar with.

  3. It isn't always necessary to log directly to the network. You could consider not doing so, and instead, using some kind of log tailing/parsing agent (something like https://www.elastic.co/products/logstash) - this can avoid touching a working syslog setup, while still having remote logging (you can also have the agent listen on localhost, and forward syslog data locally, if you don't currently store data to file).

  4. On a similar note, I would recommend you check your auditd policy, and see if there is anything there that could be causing a problem. Notably, if auditd is logging to syslog, the flow can be quite substantial, even (or especially) when using 'best practice' configs (e.g. CIS benchmarks). I have seen this cause problems in several areas, and in some cases, when audispd can no longer push messages to syslog, it may block.

  5. Finally, for things like rsyslog, you also have the option of using disk and memory queues to alleviate these kinds of issues. It takes a bit of setup (for rsyslog, see http://www.rsyslog.com/doc/v8-stable/concepts/queues.html), but does allow building a much more fault-tolerant setup, if you don't mind throwing some resources at the problem.
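The trade-off behind options 1 and 5 (lose some data, or spend resources on a queue) can be sketched on the application side too. This is a rough Python analogue, not rsyslog's queue implementation: a bounded in-memory queue sits between the app and whatever forwards the logs, and records are dropped and counted once it is full. DroppingQueueHandler, the queue size of 3 and the logger name are made up for the example; no forwarder is attached, to simulate a stalled one.

```python
import logging
import logging.handlers
import queue


class DroppingQueueHandler(logging.handlers.QueueHandler):
    """Hypothetical handler: drop records instead of waiting on a full queue."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def enqueue(self, record):
        try:
            # put_nowait() never blocks; it raises queue.Full instead.
            self.queue.put_nowait(record)
        except queue.Full:
            # The compromise: the caller stays responsive, the message is lost.
            self.dropped += 1


buffer = queue.Queue(maxsize=3)  # small bound so the demo fills it quickly
handler = DroppingQueueHandler(buffer)

logger = logging.getLogger("queue_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger
logger.addHandler(handler)

# Nothing drains the queue, as if the forwarder had stalled.
for i in range(10):
    logger.info("event %d", i)

print(f"queued={buffer.qsize()} dropped={handler.dropped}")
```

A larger queue (or one spilled to disk, as rsyslog can do) buys more time before anything is dropped, at the cost of memory or disk.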

Rsyslog provides a guide for high performance setups (http://www.rsyslog.com/doc/v8-stable/examples/high_performance.html), and failover syslog servers (http://www.rsyslog.com/doc/v8-stable/tutorials/failover_syslog_server.html). I would definitely recommend you at least investigate the central log server to make sure it is able to deal with the volume of logging, and tune it if it is not (I've had good experiences doing this with rsyslog, where a fairly 'standard' receiver config was unable to keep up, but tuning it allowed us to support several orders of magnitude more traffic).

Also, consider reviewing your logging configuration more generally - I know from (sad) experience that there can be a tendency for people to enable TRACE or DEBUG logging and leave it on, which generally does not do syslog (or the system more generally) too many favours.
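As a small illustration of keeping DEBUG noise away from syslog, this Python sketch sets a level threshold on the handler itself, so the application can still generate debug messages without them reaching that destination. CollectingHandler is a made-up stand-in for a real syslog handler so that the effect is visible; the logger name and messages are likewise invented.

```python
import logging


class CollectingHandler(logging.Handler):
    """Stand-in for a syslog handler: remembers what it would have sent."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Only WARNING and above pass this handler, whatever the logger allows.
remote = CollectingHandler(level=logging.WARNING)

log = logging.getLogger("level_demo")
log.setLevel(logging.DEBUG)  # the app itself still produces DEBUG records
log.propagate = False
log.addHandler(remote)

log.debug("cache miss for key 42")
log.info("request served")
log.warning("disk 91% full")
log.error("upstream timed out")

print(remote.messages)
```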