Nagios: NRPE: Unable to read output, Can't find the reason, can you?
Nice detailed write-up Itai! Have you tried reducing the complexity of the config to see if it works?
For starters, I would change the check_kvm line in nrpe.cfg to
command[check_kvm]=/usr/lib64/nagios/plugins/check_kvm
and temporarily change the /usr/lib64/nagios/plugins/check_kvm script to be something really simple like:
#!/bin/sh
echo Hi
exit 0
If that works, then you can start ratcheting up the complexity. Perhaps instead of giving the nagios user sudo access to the script, it really needs access to the virsh command, and you can leave out the sudo part in the nrpe.cfg command line.
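To check the stub (and later each more complex version) by hand, it can help to look at the two things a Nagios plugin is judged by: what it prints on stdout and its exit status. The following is only a sketch; `run_plugin` is a hypothetical helper, not part of Nagios or NRPE, and the inline `sh -c` command merely stands in for the stub script above.

```shell
# Hypothetical helper: run a plugin command, capture its stdout,
# and report the output text together with the exit status.
run_plugin() {
    output=$("$@")      # stdout only; stderr still goes to the terminal
    status=$?
    if [ -z "$output" ]; then
        echo "no output, exit status $status"
    else
        echo "output: $output"
        echo "exit status: $status"
    fi
    return "$status"
}

# Example with an inline stand-in for the stub script.
run_plugin sh -c 'echo Hi; exit 0'
```

On the monitored host you would pass the real plugin path instead, ideally while running as the nagios user. A "no output" result is worth chasing first, since an empty stdout is a common reason for the "Unable to read output" message.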
I saw a problem on a Gentoo server that resembles yours at http://forums.gentoo.org/viewtopic-t-806014-start-0.html
That thread describes a nice method for debugging the issue.
The user in that post had a problem with check_disk and got the exact same error message as you.
He was told to execute the following command:
ssh remote_ip /usr/lib/nagios/plugins/check_disk -w 10 -c 5 -p "/" 2>&1
The 2>&1 redirects stderr to stdout, so any error message the plugin prints shows up in the output and might reveal the exact error.
So in your case, replace remote_ip with the IP address of the server that check_nrpe fails against, and replace the check_disk command with the full command that check_kvm is supposed to execute. If check_kvm runs without any parameters, you can simply execute:
ssh <remote_ip> /usr/lib64/nagios/plugins/check_kvm 2>&1
That will hopefully reveal more information about the problem.
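To see why the 2>&1 matters, here is a small local sketch that needs no remote host. `failing_plugin` is a made-up stand-in for a plugin that fails and writes only to stderr; the message and exit code are invented for illustration.

```shell
# Hypothetical stand-in for a plugin that fails and writes only to stderr.
failing_plugin() {
    echo "simulated error from plugin" >&2
    return 2
}

# Capture stdout only; stderr is discarded here to keep the terminal quiet.
stdout_only=$(failing_plugin 2>/dev/null)
# With 2>&1, stderr is merged into stdout and gets captured as well.
merged=$(failing_plugin 2>&1)

echo "stdout only: '$stdout_only'"
echo "with 2>&1:   '$merged'"
```

The first variable ends up empty while the second holds the error text, which is the same effect the redirect has on the ssh command above.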
Good luck!
I had the same issue and managed to solve it by killing the nagios process (on the monitored machine):
ps -ef | grep nagios
kill -9 [NagiosProcessNumber]
/etc/init.d/nagios-nrpe-server start
All went fine after that.
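A slightly gentler variant of the kill step, in case the stale process still responds to a normal termination signal, might look like the sketch below. `stop_pid` is a hypothetical helper and the 5-second wait is an arbitrary assumption; it falls back to kill -9 only if the process is still alive afterwards.

```shell
# Hypothetical helper: stop a process by PID, trying SIGTERM before SIGKILL.
stop_pid() {
    pid=$1
    # Send SIGTERM; if that fails, the process is already gone
    # (or is not ours to signal), so there is nothing more to do.
    kill "$pid" 2>/dev/null || return 0
    # Wait up to 5 seconds for a clean exit.
    i=0
    while [ "$i" -lt 5 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        i=$((i + 1))
    done
    # Still alive: fall back to SIGKILL, as in the commands above.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
    return 0
}
```

You would call it as `stop_pid [NagiosProcessNumber]` with the PID found via ps, then start the NRPE service again as shown above.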