Receiving Resolved mail for a check where the fatigue setting is applied

Hi @Nasirhussen_Mulla

Resolved Alert: If the check fails at least as many times as the number defined in “fatigue_check/occurrences”, a resolved alert should be sent only after the check subsequently succeeds.

Hi @Shivani_Bhardwaj

What are the conditions for triggering a Resolved Alert? Specifically, does it require just one successful occurrence after a critical alert has been generated, or must the number of consecutive successful occurrences match the value defined in “fatigue_check/occurrences”?

For example, if we define the value as 3 for “fatigue_check/occurrences”:

  • A critical alert will be generated after 3 consecutive failed checks.
  • Should a resolved alert then be generated only after 3 consecutive successful checks following the critical alert?

Thank you for your clarification.

Hi @Nasirhussen_Mulla

If the value for “fatigue_check/occurrences” is set to 3, the alert behavior should follow this logic:

  • Critical Alert: A critical alert will be triggered only after 3 consecutive event failures. This means the system will not raise a critical alert on the first or second failure but will wait until the third failure to generate the alert.
  • Resolved Alert: A resolved alert should be triggered only when the most recent event is successful and at least the 3 events immediately before it in the history were consecutive failures. This ensures that a resolved alert is raised only after the issue persisted for 3 or more consecutive failures and was then successfully resolved.
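
The logic above can be sketched as a small replay over check statuses. This is an illustrative sketch only, not Sensu's implementation: `replayAlerts` and its variables are hypothetical names, and it assumes a single critical alert per failure run (no repeat notifications).

```javascript
// Hypothetical helper: replay a sequence of check statuses
// (0 = OK, non-zero = failure) and report which alert, if any,
// each result would raise for a threshold of n occurrences.
function replayAlerts(statuses, n) {
  var alerts = [];
  var failures = 0;         // length of the current run of consecutive failures
  var incidentOpen = false; // true once a critical alert has been raised

  statuses.forEach(function (status) {
    if (status !== 0) {
      failures += 1;
      if (failures === n) {
        // The n-th consecutive failure raises the critical alert.
        incidentOpen = true;
        alerts.push("critical");
      } else {
        alerts.push(null);
      }
    } else {
      // A single OK result resolves the incident, but only if the
      // failure run had reached the threshold.
      alerts.push(incidentOpen ? "resolved" : null);
      failures = 0;
      incidentOpen = false;
    }
  });
  return alerts;
}

console.log(replayAlerts([2, 2, 2, 0], 3)); // [ null, null, 'critical', 'resolved' ]
console.log(replayAlerts([2, 2, 0], 3));    // [ null, null, null ]
```

In the second run the check recovers after only two failures, so neither a critical nor a resolved alert is raised.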

Hope this helps

Hi @Shivani_Bhardwaj,

Thank you for the clarification. We now have a complete understanding of the requirements. Please note that the fatigue check filter has some limitations regarding the use of annotations in expressions. We will do our best to develop the filter based on your requirements and will keep you updated.

Many Thanks,
Nasirhussen

Hi @Shivani_Bhardwaj,

As per the requirement, we have prepared the filter below, and it is working fine for us. Please give it another try and let us know how it goes for you.

type: EventFilter
api_version: core/v2
metadata:
  name: fatigue-resolve-alert-filter
  namespace: default
  labels:
    sensu.io/managed_by: sensuctl
    created_by: sensu
spec:
  action: allow
  expressions:
  - |
    (
      (event.check.status != 0 && event.check.occurrences == event.check.annotations["fatigue_check/occurrences"]) ||
      (event.check.status == 0 &&
       event.check.history[-1].status != 0 &&
       event.check.history[-2].status != 0 &&
       event.check.history[-3].status != 0 &&
       event.check.occurrences >= event.check.annotations["fatigue_check/occurrences"])
    )

Many Thanks,
Nasirhussen

Hi @Nasirhussen_Mulla,

Here, we are using a static index value, which applies when fatigue_check/occurrences equals 3. However, we need to dynamically retrieve the value from fatigue_check/occurrences, as it won’t always be 3.
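
One way to make the lookback dynamic is to read the threshold from the annotation and inspect the trailing history entries with `slice`, since JavaScript arrays do not support negative indices (`history[-1]` evaluates to `undefined`). A minimal sketch, assuming `event.check.history` is ordered oldest to newest and that its last entry is the current result; `isFatigueResolution` is a hypothetical name, and the logic would still need to be adapted into a filter expression and verified against real events:

```javascript
// Sketch of a dynamic resolve test. Assumptions (not confirmed in this thread):
// - event.check.history is ordered oldest to newest
// - the last history entry is the current check result
function isFatigueResolution(event) {
  var annotations = event.check.annotations || {};
  // Annotation values are strings, so convert before comparing.
  var n = parseInt(annotations["fatigue_check/occurrences"], 10);
  if (!(n >= 1)) {
    return false; // missing or invalid annotation
  }

  var history = event.check.history || [];
  // Need the current OK result plus n earlier results.
  if (event.check.status !== 0 || history.length < n + 1) {
    return false;
  }

  // The n results just before the current one must all be failures.
  var previous = history.slice(-(n + 1), -1);
  return previous.every(function (entry) {
    return entry.status !== 0;
  });
}

// Illustrative event with made-up data: three failures, then a recovery.
var sampleEvent = {
  check: {
    status: 0,
    annotations: { "fatigue_check/occurrences": "3" },
    history: [{ status: 2 }, { status: 2 }, { status: 2 }, { status: 0 }]
  }
};
console.log(isFatigueResolution(sampleEvent)); // true
```

Because the threshold comes from the annotation, the same test covers any value of fatigue_check/occurrences without hard-coded indices.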

Hi @Shivani_Bhardwaj

Could you please confirm whether it is working as expected, so that I can prepare a fully dynamic expression based on the one above?

Many Thanks,
Nasirhussen

Hi @Nasirhussen_Mulla,

The check failed three times, but I still haven’t received a resolved email after the check was resolved following the three failed events.

Note:
We are not concerned about the critical email being triggered, as it is handled by the fatigue settings in the check configuration. Our main focus is on receiving the resolved alert after n failed events.

Hi @Shivani_Bhardwaj,

I’m experiencing some formatting issues while drafting the reply. Could you please reach out to me via email at my official address, n.mulla.ctr@sumologic.com? I would like to share the complete configuration of the check filter and handler, where everything is working correctly, for further discussion.

Many Thanks,

Hi @Nasirhussen_Mulla

I have sent you an email.