send_nrdp.sh not working with arguments when used as an obsessive compulsive command
box293 opened this issue
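In send_nrdp.sh, this if statement is used to determine whether STDIN is being used, for situations where you are piping to the script:
if [ ! -t 0 ]; then
Note that this test only checks that file descriptor 0 is not a terminal, so it is also true when STDIN is something other than a pipe, such as /dev/null.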
I was configuring the script as an obsessive compulsive command for hosts and services. This was on a Nagios Core server sending results back to a central Nagios XI server. I was using the script in argument mode, passing all the arguments instead of piping them. When I do this, the $xml variable in the script gets reset to nothing and the script fails. You can see this in /var/log/messages with:
stdout line 01: ERROR: The NRDP Server said BAD XML
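To illustrate how the `-t 0` test evaluates for different kinds of standard input, here is a minimal sketch. The function name `stdin_kind` is made up for this example and is not part of send_nrdp.sh; it assumes bash.

```shell
#!/bin/bash
# Hypothetical helper (not part of send_nrdp.sh): report what fd 0 is attached to.
stdin_kind() {
    if [ -t 0 ]; then
        echo "terminal"     # interactive shell: [ ! -t 0 ] is false
    elif [ -p /dev/stdin ]; then
        echo "pipe"         # e.g. echo data | script: [ ! -t 0 ] is true
    else
        echo "other"        # e.g. /dev/null or a redirected file: [ ! -t 0 ] is also true
    fi
}

echo data | stdin_kind      # prints: pipe
stdin_kind < /dev/null      # prints: other
```

Only the first branch corresponds to running the script by hand in a terminal. Both of the other cases make `[ ! -t 0 ]` true, even though only one of them actually carries piped data.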
Here are the command definitions:
define command{
command_name send_nrdp_host
command_line $USER1$/send_nrdp.sh -u https://10.25.5.17/nrdp/ -t XXXXX -H '$HOSTNAME$' -S $HOSTSTATEID$ -o '$HOSTOUTPUT$'
}
define command{
command_name send_nrdp_service
command_line $USER1$/send_nrdp.sh -u https://10.25.5.17/nrdp/ -t XXXXX -H "$HOSTNAME$" -s "$SERVICEDESC$" -S $SERVICESTATEID$ -o "$SERVICEOUTPUT$"
}
Here are the relevant settings in nagios.cfg:
obsess_over_services=1
ocsp_command=send_nrdp_service
obsess_over_hosts=1
ochp_command=send_nrdp_host
Here is how you can reproduce / troubleshoot.
Make the following changes to the script:
On line 2 add this:
echo -e "`ls -la /proc/$$/fd`"
On line 220 (before if [ ! -t 0 ]; then) add this:
echo $xml
On line 224 (after if [ ! -t 0 ]; then) add this:
echo here
On line 255 (after fi) add this:
echo $xml
Now if you test from the command line:
/usr/local/nagios/libexec/send_nrdp.sh -u https://10.25.5.17/nrdp/ -t XXXXX -H "host" -s "service" -S 0 -o "test"
The output will be something like:
total 0
dr-x------. 2 root root 0 Aug 15 15:01 .
dr-xr-xr-x. 9 root root 0 Aug 15 15:01 ..
lrwx------. 1 root root 64 Aug 15 15:01 0 -> /dev/pts/2
lrwx------. 1 root root 64 Aug 15 15:01 1 -> /dev/pts/2
lrwx------. 1 root root 64 Aug 15 15:01 2 -> /dev/pts/2
lr-x------. 1 root root 64 Aug 15 15:01 255 -> /usr/local/nagios/libexec/send_nrdp.sh
lr-x------. 1 root root 64 Aug 15 15:01 3 -> pipe:[822644]
<checkresult type='service' checktype='1'><servicename>service</servicename><hostname>host</hostname><state>0</state><output><![CDATA[test]]></output></checkresult>
<checkresult type='service' checktype='1'><servicename>service</servicename><hostname>host</hostname><state>0</state><output><![CDATA[test]]></output></checkresult>
Sent 1 checks to https://10.25.5.17/nrdp/
You can see that fd 0 points to /dev/pts/2, a terminal, so the if statement was not true and the then block was not executed. Hence the $xml variable is not overwritten by this line of code:
xml=""
To see it fail when Nagios Core uses it, execute this command:
tail -f /var/log/messages
Now go to a service object and force a check. This causes the obsessive compulsive command to be executed, and the output will be something like:
Aug 15 15:04:12 core-015 nagios: wproc: OCSP job 22 from worker Core Worker 1973 is a non-check helper but exited with return code 2
Aug 15 15:04:12 core-015 nagios: wproc: host=DNS1; service=SSH; contact=(none)
Aug 15 15:04:12 core-015 nagios: wproc: early_timeout=0; exited_ok=1; wait_status=512; error_code=0;
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 01: total 0
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 02: dr-x------. 2 nagios nagios 0 Aug 15 15:04 .
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 03: dr-xr-xr-x. 9 nagios nagios 0 Aug 15 15:04 ..
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 04: lr-x------. 1 nagios nagios 64 Aug 15 15:04 0 -> /dev/null
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 05: l-wx------. 1 nagios nagios 64 Aug 15 15:04 1 -> pipe:[823219]
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 06: l-wx------. 1 nagios nagios 64 Aug 15 15:04 2 -> pipe:[823220]
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 07: lr-x------. 1 nagios nagios 64 Aug 15 15:04 255 -> /usr/local/nagios/libexec/send_nrdp.sh
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 08: lr-x------. 1 nagios nagios 64 Aug 15 15:04 3 -> pipe:[823223]
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 09: <checkresult type='service' checktype='1'><servicename>SSH</servicename><hostname>DNS1</hostname><state>0</state><output><![CDATA[SSH OK - OpenSSH_6.7p1 Raspbian-5+deb8u1 (protocol 2.0)]]></output></checkresult>
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 10: here
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 11:
Aug 15 15:04:12 core-015 nagios: wproc: stdout line 12: ERROR: The NRDP Server said BAD XML
You can see that stdout line 09 shows the content of the $xml variable.
Then stdout line 10 shows that the if statement was true (stdout line 04 shows fd 0 pointing to /dev/null, which is not a terminal) and the then block was executed. Hence the $xml variable is overwritten by this line of code:
xml=""
You can see that stdout line 11 shows the empty $xml variable.
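The same effect can be shown without Nagios by pointing standard input at /dev/null. The sketch below is a simplified stand-in, not the actual send_nrdp.sh code. The function names, the `<checkresult>` placeholder payload, and the stdin-reading loop are all assumptions made for illustration. The guarded variant is just one possible way to avoid the problem and is not necessarily what the eventual fix does.

```shell
#!/bin/bash
# Hypothetical stand-ins for the relevant logic; not the actual send_nrdp.sh code.

# Mirrors the behaviour described in this issue: any non-terminal stdin
# discards the XML that was already built from the arguments.
build_xml_tty_test() {
    local line xml="<checkresult>$1</checkresult>"   # built from arguments
    if [ ! -t 0 ]; then
        xml=""                                       # argument data is lost here
        while IFS= read -r line || [ -n "$line" ]; do
            xml="${xml}${line}"
        done
    fi
    printf '%s\n' "$xml"
}

# Assumed alternative: consult stdin only when the arguments supplied no data.
build_xml_guarded() {
    local line xml=""
    if [ -n "$1" ]; then
        xml="<checkresult>$1</checkresult>"
    elif [ ! -t 0 ]; then
        while IFS= read -r line || [ -n "$line" ]; do
            xml="${xml}${line}"
        done
    fi
    printf '%s\n' "$xml"
}

build_xml_tty_test "from-args" < /dev/null   # prints an empty line
build_xml_guarded  "from-args" < /dev/null   # prints <checkresult>from-args</checkresult>
printf '<x/>\n' | build_xml_guarded          # prints <x/>
```

Against the real script, running the earlier command-line test with `< /dev/null` appended should presumably trigger the same BAD XML error, since that matches the fd 0 target shown in the worker's listing.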
Tested on CentOS 7.x, Nagios Core 4.3.2, send_nrdp.sh from master.
Fixed in merged pull #22