Keepalive handler only sends slack alert on first warning

CJ_OTH · September 24, 2020, 12:04pm

We have been using the community version of Sensu Go for a while now at our company and are currently running version 5.19.1.

We have set up a Slack handler with the filters is_incident, not_silenced and hourly. It works like a charm: when a check is down for an extended period of time, we get alerts on our Slack channel every hour.

This, however, is not true for keepalive checks. We only get one Slack notification, on the very first warning that an agent has not responded for >=120s, and then nothing after that.

The keepalive handler is a copy'n'paste from the documentation (https://docs.sensu.io/sensu-go/5.19/reference/handlers/#keepalive-event-handlers), so nothing special there either.

My question is: are keepalive events not considered incidents? And if not, is there a way to have these keepalive warnings posted to Slack every hour, like normal failed checks?

jspaleta · September 24, 2020, 11:27pm

Hey,

This is most likely due to the fact that your hourly filter expression assumes that keepalive events are being produced at some regular interval. But that's not actually happening.

The keepalive events are NOT being generated at the regular keepalive interval. What is happening is that the entity's keepalive timeout is being reached, so events are produced at the cadence of the keepalive-timeout for that entity.

So, for example, let's assume the keepalive interval is 20 seconds and the keepalive timeout is 120 seconds.

Using the example hourly filter rule in the documentation, you are checking whether the failure occurrence count is equal to an hour's worth of keepalive event intervals (3600/20). But because no keepalive events are actually being sent to the backend, what is happening is that keepalive timeout events are being generated every 120 seconds, which is a slower rate than the hourly filter would calculate.
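To make the arithmetic concrete, here is a rough simulation of that mismatch. It is only a sketch under assumptions: it takes the documented hourly filter to be roughly `occurrences == 1 || occurrences % (3600 / interval) == 0`, and it assumes the occurrence count goes up by one each time a timeout event is generated (every 120 seconds in the example). The function name `alert_times` and its parameters are made up for illustration and are not part of Sensu.

```python
def alert_times(event_period_s, divisor_interval_s, horizon_s):
    """Return the times (seconds after the first failure event) at which an
    hourly-style occurrences filter would let an event through.

    event_period_s     -- how often failing events actually arrive (assumed)
    divisor_interval_s -- the interval the filter divides 3600 by (assumed)
    horizon_s          -- how long to simulate
    """
    per_hour = 3600 // divisor_interval_s  # occurrences the filter treats as "one hour"
    times = []
    occurrences = 0
    for t in range(0, horizon_s + 1, event_period_s):
        occurrences += 1
        # Assumed shape of the documented filter:
        # occurrences == 1 || occurrences % (3600 / interval) == 0
        if occurrences == 1 or occurrences % per_hour == 0:
            times.append(t)
    return times


SIX_HOURS = 6 * 3600

# Filter divides by the 20 s keepalive interval, but events arrive every 120 s.
mismatched = alert_times(event_period_s=120, divisor_interval_s=20, horizon_s=SIX_HOURS)

# Filter divides by the 120 s timeout, matching the real event cadence.
matched = alert_times(event_period_s=120, divisor_interval_s=120, horizon_s=SIX_HOURS)

print("mismatched:", mismatched)
print("matched:   ", matched)
```

Under these assumptions the mismatched filter passes the first event and then stays quiet for almost six hours, which would look like "only the first warning" in Slack. Dividing by the value that matches the real event cadence (for instance the keepalive timeout, if your events expose it) brings the alerts back to roughly once an hour.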

CJ_OTH · October 1, 2020, 8:45am

Thank you for your answer. That makes sense and is easily fixed by tweaking the hourly filter expression.
