What is a check_mk sticky comment when acknowledging a host/service?
I wanted to attach a comment to a system that is being monitored with Nagios. I prefer using check_mk as the GUI. Then I stumbled across this: I can set a comment as sticky and/or persistent.
So I asked our Nagios-admin what the difference between sticky and persistent would be.
It turned out that he did not know about "sticky", so this has to be something check_mk-specific.
After a Google search and a review of the check_mk documentation, I could not find anything on that topic.
So: What is the difference between sticky and persistent for Nagios-service-comments?
Update: Here is a screenshot - check_mk quicksearch for a specific server, then select the hammer symbol. Then this will show up:
The question is about the Acknowledge-Box: sticky vs. persistent
Solution 1:
I'll answer with some nitty-gritty details. Jenny D's answer is to the point, but I'd like to be more precise about "no further alarms".
Normally, Nagios will notify you on each status change:
- So if your service becomes "WARN", you get a notification.
- You acknowledge the service now, and will not get another (i.e. periodic) notification as long as the service stays in the "WARN" state.
- If it moves to "CRIT", you get a notification.
- If it goes back to "WARN", you get a notification.
- If it then goes to "OK", you get a recovery notification.
- After that, the acknowledgement has expired, since the service is back to "OK".
In the sticky scenario, there will be no notifications about transitions between problem states:
- So if your service becomes "WARN", you get a notification.
- You acknowledge the service now with the sticky option set.
- If it moves to "CRIT", you get no notification.
- If it goes back to "WARN", you get no notification.
- If it then goes to "OK", you get a recovery notification.
- After that, the sticky setting is gone as well, since it is a property of the acknowledgement, which expired when the service became "OK".
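The two scenarios above can be replayed with a small toy model. This is only a sketch of the behaviour described in the lists, not Nagios code: the function name `simulate`, its parameters, and the rule "a non-sticky acknowledgement is dropped on the next state change" are assumptions made for illustration.

```python
def simulate(states, ack_after, sticky):
    """Replay a series of state changes and report which ones notify.

    states    -- states the service enters, in order (each entry is a change)
    ack_after -- index of the state after which the operator acknowledges
    sticky    -- whether the acknowledgement is sticky
    Returns a list of (state, notified) tuples.
    """
    acknowledged = False
    result = []
    for i, state in enumerate(states):
        if state == "OK":
            # Recovery: always notified, and any acknowledgement ends here.
            acknowledged = False
            notified = True
        else:
            if acknowledged and not sticky:
                # Assumed rule: a non-sticky acknowledgement does not
                # survive a change to another problem state.
                acknowledged = False
            notified = not acknowledged
        result.append((state, notified))
        if i == ack_after:
            # The operator acknowledges the problem in its current state.
            acknowledged = True
    return result


history = ["WARN", "CRIT", "WARN", "OK"]
print(simulate(history, ack_after=0, sticky=False))
print(simulate(history, ack_after=0, sticky=True))
```

With the non-sticky acknowledgement every state change notifies; with the sticky one, only the initial WARN and the final recovery do.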
In human terms:
Not setting the sticky option means: I'm working on the issue, but this will take a while. For example, while it's just a WARN, I'm not authorized to map a new disk. If things suddenly escalate and the filesystem fills up to CRIT, I need to know, because then we move from proactive maintenance to an emergency fix.
The sticky option allows you to choose another way of working: I'm working on the issue and will keep an eye on it while I work. During my work it can worsen temporarily until I'm DONE, and then it'll be fine.
FYI: If you use the persistent comment option, the acknowledgement itself will be gone, but the text you entered will remain.
Solution 2:
The question is about the Acknowledge-Box: sticky vs. persistent
OK, they are what I have described in the above comment. Take a look at this for more details:
If the "sticky" option is set to one (1), the acknowledgement will remain until the host returns to an UP state. Otherwise the acknowledgement will automatically be removed when the host changes state.
If the "persistent" option is set to one (1), the comment associated with the acknowledgement will survive across restarts of the Nagios process. If not, the comment will be deleted the next time Nagios restarts.
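The quoted text describes options of the Nagios external command `ACKNOWLEDGE_HOST_PROBLEM`, whose fields are host name, sticky, notify, persistent, author and comment, separated by semicolons. As a minimal sketch, the helper below only builds such a command line. The function name, the host `web01` and the author `alice` are hypothetical, and the numeric value that enables "sticky" should be checked against the documentation of your Nagios version (the quote above uses 1). Writing the line to the Nagios command file is deliberately left out.

```python
import time


def build_ack_host_command(host, author, comment,
                           sticky=1, notify=1, persistent=1, now=None):
    """Build an ACKNOWLEDGE_HOST_PROBLEM external command line.

    The sticky/notify/persistent values are passed through unchanged;
    which number enables "sticky" depends on the Nagios version in use.
    """
    # External commands are prefixed with a Unix timestamp in brackets.
    timestamp = int(time.time()) if now is None else int(now)
    return (f"[{timestamp}] ACKNOWLEDGE_HOST_PROBLEM;"
            f"{host};{sticky};{notify};{persistent};{author};{comment}")


# Hypothetical host and author, fixed timestamp for a reproducible line.
print(build_ack_host_command("web01", "alice", "Disk swap in progress",
                             now=1700000000))
```

Here `persistent=1` corresponds to the comment surviving a Nagios restart, while `sticky` controls how long the acknowledgement itself lasts.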
Solution 3:
The "sticky" here means a "sticky acknowledgement": there will be no further alarms until this issue has been resolved. In other words, the fact that you've acknowledged it will stick to the fault even if the same fault keeps generating alarms. (Of course, this only lasts until the current problem has been resolved and the issue stops generating alarms. The next time it fails, it will generate alarms again.)