I’ve set up a monitoring system with Sensu that emails me when certain services on my VMs are in critical condition. Using filters, I’ve configured it so that emails are only sent once a VM has been in critical condition for over an hour. Now I would also like to get an email when the VMs are no longer in critical condition.
For example, if memory usage was over 90% for two hours, I would get one email per hour, so two emails in total. Then, say shortly after that, memory usage drops below 90% and the check goes back to status 0. In this case, I would like to get an email saying that memory is below 90% (no longer in critical condition).
I’ve looked at the “resolve” documentation and it seems like this is what I want to use. Here’s where I get stuck: how can I get the resolve to only email me if the critical condition lasted for more than an hour? Take the previous example, but change it so that memory usage was over 90% for only half an hour. In that case I would receive 0 critical emails, because it was not in critical condition for more than an hour. However, it still switched from critical to normal, so I would still get a resolve email even though I don’t want one. How can I get the resolve email only if the critical condition lasted for at least an hour, or, say, for at least 120 occurrences (the check runs once every 30 seconds, so 30 s × 120 = 3600 s = 1 hour)?
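To make the logic I’m after concrete, here is a rough sketch in plain Python. It is not Sensu configuration, only a model of the decision. The function name `should_email` and the `watermark` argument (the highest occurrence count reached during the incident) are made up for illustration; I’m assuming the event can somehow tell me how long the check stayed critical before it resolved.

```python
# Illustration only: a plain-Python model of the filtering I want,
# not actual Sensu filter or handler configuration.

CHECK_INTERVAL_S = 30    # the check runs every 30 seconds
MIN_CRITICAL_S = 3600    # only care about incidents lasting an hour or more
MIN_OCCURRENCES = MIN_CRITICAL_S // CHECK_INTERVAL_S  # 3600 / 30 = 120


def should_email(action, occurrences, watermark):
    """Decide whether an event should produce an email.

    action      -- "create" while the check is critical, "resolve" on recovery
                   (names assumed for this sketch)
    occurrences -- consecutive critical results so far
    watermark   -- hypothetical: highest occurrence count reached during
                   the incident, still available when the check resolves
    """
    if action == "create":
        # One email per full hour of critical state: at 120, 240, ...
        return occurrences >= MIN_OCCURRENCES and occurrences % MIN_OCCURRENCES == 0
    if action == "resolve":
        # Only announce recovery if the incident lasted at least an hour.
        return watermark >= MIN_OCCURRENCES
    return False


# Two hours of critical results: which occurrence counts trigger an email?
emails = [n for n in range(1, 241) if should_email("create", n, n)]
```

With this model, two hours of critical state followed by recovery gives two critical emails plus one resolve email, while half an hour of critical state (60 occurrences) followed by recovery gives none at all. The part I can’t work out is how to express the “resolve” branch in Sensu itself.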