Zabbix Trigger Hysteresis - Not returning to normal
I've done some experiments:
The agent.ping item writes a value of "1" when the agent is reachable and writes nothing when it is unreachable, so even if your agent is unreachable for 2 hours, the last value is still 1. This means that .min(), .avg(), etc. always operate on a list of "1" values.
The .nodata() function does not help with rebounds either: it returns "1" only if no data was received during the entire time interval, and "0" otherwise.
For instance, .nodata(20m) on a 60-second item will return:
- 1: if no data was received for the entire 20m time range (20 missing values)
- 0: if everything's ok (20 values received)
- 0: for everything in between (e.g. 5 ok, 5 minutes unreachable, 10 ok)
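To make the three cases concrete, here is a minimal Python sketch that mimics the behaviour described above. It is not Zabbix code: the nodata() helper and the timestamp lists are made up for illustration, and it assumes one value per minute with no timing jitter.

```python
def nodata(timestamps, now, period):
    """Mimic .nodata(period): 1 if nothing arrived in (now - period, now], else 0."""
    return 0 if any(now - period < t <= now for t in timestamps) else 1

MINUTE = 60
now = 20 * MINUTE
period = 20 * MINUTE

# Hypothetical arrival times (in seconds) of agent.ping values.
all_missing = []                                      # agent unreachable the whole time
all_ok = [m * MINUTE for m in range(1, 21)]           # one value every minute
gap = ([m * MINUTE for m in range(1, 6)]              # 5 ok ...
       + [m * MINUTE for m in range(11, 21)])         # ... 5 missing, then 10 ok

print(nodata(all_missing, now, period))  # 1
print(nodata(all_ok, now, period))       # 0
print(nodata(gap, now, period))          # 0
```

The last two cases are indistinguishable, which is why .nodata() alone cannot tell a stable agent from a flapping one.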
I have found a workaround, assuming that you check the agent reachability every 60 seconds:
({TRIGGER.VALUE}=0 and {Template App Zabbix Agent:agent.ping.nodata(5m)}=1) or ({TRIGGER.VALUE}=1 and {Template App Zabbix Agent:agent.ping.count(20m,1)}<20)
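{TRIGGER.VALUE} is the current trigger state (0 = OK, 1 = PROBLEM), so the first branch decides when to fire and the second decides when to stay in PROBLEM. The expression will trigger after 5 minutes of unreachability and recover only once 20 "1" values have been received within the last 20 minutes, i.e. after 20 minutes of uninterrupted reachability.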
Not too elegant, but it works.
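If you want to see how the two branches interact, the following Python sketch replays the expression minute by minute. It is only a simplified model, not how Zabbix evaluates triggers internally: simulate() and the received list are hypothetical, one step equals one 60-second check, and values are assumed to arrive exactly on time.

```python
def simulate(received, nodata_min=5, recover_min=20):
    """received[i] is True if agent.ping delivered a "1" in minute i.
    Returns the trigger state (0 = OK, 1 = PROBLEM) after each minute."""
    state = 0
    states = []
    for i in range(len(received)):
        nodata_window = received[max(0, i - nodata_min + 1): i + 1]
        count_window = received[max(0, i - recover_min + 1): i + 1]
        if state == 0:
            # {TRIGGER.VALUE}=0 branch: fire when nodata(5m)=1,
            # i.e. a full 5-minute window contains no values at all.
            if len(nodata_window) == nodata_min and not any(nodata_window):
                state = 1
        else:
            # {TRIGGER.VALUE}=1 branch: stay in PROBLEM while count(20m,1)<20.
            if sum(count_window) >= recover_min:
                state = 0
        states.append(state)
    return states

# 30 min ok, 6 min outage, 5 min ok, 2 min flap, then 25 min ok.
received = [True] * 30 + [False] * 6 + [True] * 5 + [False] * 2 + [True] * 25
states = simulate(received)

first_problem = states.index(1)
recovered_at = states.index(0, first_problem)
print(first_problem, recovered_at)  # 34 62
```

In this run the trigger fires at minute 34 (the fifth consecutive minute without data) and the short flap at minutes 41-42 does not cause a recovery and a second alert: the state only returns to OK at minute 62, after 20 consecutive values.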