Nagios Ping Problem
I'm using Nagios3 on my Debian 6 box to monitor my network. The system pings www.google.com periodically to check whether the Internet connection is up.
The weird thing is that when Nagios checks whether www.google.com is reachable, it reports "Network Unreachable". However, I can ping www.google.com manually.
root@dragonfly:~# /usr/lib/nagios/plugins/check_ping -H www.google.com -c 100,90% -w 100,90%
CRITICAL - Network Unreachable (www.google.com)
root@dragonfly:~# ping www.google.com
PING www.l.google.com (74.125.236.208) 56(84) bytes of data.
64 bytes from maa03s17-in-f16.1e100.net (74.125.236.208): icmp_req=1 ttl=53 time=210 ms
64 bytes from maa03s17-in-f16.1e100.net (74.125.236.208): icmp_req=2 ttl=53 time=229 ms
^C
--- www.l.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 210.611/219.867/229.124/9.268 ms
After a bit of googling, here's what I found.
8 June was World IPv6 Day. A large number of sites, Google among them, enabled IPv6 that day (and some are still using it). When my Nagios server tried to communicate with google.com, it tried to reach the IPv6 address. My Nagios server itself does not have an IPv6 address, so naturally the communication did not work.
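A quick way to see whether a name resolves to both address families is to count the addresses the resolver hands back. The helper below is only a rough sketch: `count_families` is a name made up for this post, and it assumes input in the format printed by `getent ahosts` (address in the first column).

```shell
# Hypothetical helper: count distinct IPv4 and IPv6 addresses in
# `getent ahosts`-style output read from stdin.
# An address containing ":" is treated as IPv6, anything else as IPv4.
count_families() {
    awk 'NF && !seen[$1]++ { if ($1 ~ /:/) v6++; else v4++ }
         END { printf "ipv4=%d ipv6=%d\n", v4, v6 }'
}
```

On the Nagios server this could be fed with `getent ahosts www.google.com | count_families`. A non-zero ipv6 count, combined with no output from `ip -6 addr show scope global`, would match the situation described above.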
So I forced Nagios to use IPv4, and it worked like a charm.
root@dragonfly:~# /usr/lib/nagios/plugins/check_ping -4 -H www.google.com -w 100,90% -c 100,90%
PING OK - Packet loss = 0%, RTA = 73.09 ms|rta=73.092003ms;100.000000;100.000000;0.000000 pl=0%;90;90;0
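To double-check that the address family really is the deciding factor, the same check can be run once with -4 and once with -6 and the exit codes compared (Nagios plugins exit with 0 for OK and 2 for CRITICAL). This is just a sketch; `compare_families` is a hypothetical helper name, and the thresholds are simply the ones used in the commands above.

```shell
# Hypothetical helper: run a ping-style plugin once per address family
# and print the exit status of each run.
# $1 = plugin to run (for example the path to check_ping), $2 = host to probe.
compare_families() {
    plugin=$1
    host=$2
    for fam in -4 -6; do
        status=0
        # Discard the plugin's text output; only the exit status matters here.
        "$plugin" "$fam" -H "$host" -w 100,90% -c 100,90% >/dev/null 2>&1 || status=$?
        printf '%s exit=%s\n' "$fam" "$status"
    done
}
```

For example, `compare_families /usr/lib/nagios/plugins/check_ping www.google.com` on a server without an IPv6 address would be expected to print exit=0 for -4 and a non-zero status for -6.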
Now that the problem has been identified, it's time to tweak Nagios so that it uses IPv4 when checking whether a host is alive.
root@dragonfly:~# vim /etc/nagios-plugins/config/ping.cfg
##### ADDING -4 PARAMETER #####
# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    /usr/lib/nagios/plugins/check_ping -4 -H '$HOSTADDRESS$' -w 5000,100% -c 5000,100% -p 1
}
root@dragonfly:~# /etc/init.d/nagios3 restart
And it is done. Hope this helps. ^_^
Reference: http://serverfault.com/questions/278196/nagios-bizare-ping-behaviour
That's cool. So vendors are giving default preference to IPv6, then.
Thanks - saved me some brain-ache!
Great stuff - Thanks!
Current nagios has a check-host-alive_4, you can use this as check_command
Awesome, thanks!