Handling and alerting on Aggregate check results


#1

Hello Sensu Community!

I have several systems that are set up in clusters, and each node performs the same kinds of cluster health status checks. As a result, if a warning or critical occurs, every client sends its own result and the alarms go off 3 times at once.

I am trying to see if there’s a way to handle the single result coming from an Aggregate check instead of handling all of the individual checks.

I have tried setting "handle": false in my check, but then neither the individual checks nor the aggregate check itself is handled.

The aggregate check does not seem to go through my handler.

Ubuntu 16.04 Xenial
Sensu Enterprise 3.2.2
redis 2.8.20
rabbitmq 3.7.8
Sensu Enterprise Dashboard 1.2.12
Sensu Core (clients) 1.6.1


#2

Hey Jordan, I would need to see the relevant configuration to know what’s not working for you.

I recently wrote a series of blog posts on dealing with alert fatigue, and part 4 briefly goes over aggregates in general. If you are looking for something a bit more in depth, I would suggest checking out this post. I think the missing piece is a proxy/JIT client: it essentially creates a custom client in Sensu, so the event is handled for a single client regardless of how many clients run the check via aggregates or round robin. To do this, add "source": "MY_CLUSTER_NAME" to your check.
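As a rough sketch of the proxy client idea, a check definition might look something like this. Only the "source" attribute comes from the suggestion above; the check name, command, subscription, and handler name are hypothetical placeholders and not taken from Jordan's configuration. The "roundrobin:" subscription prefix is one way to have a single cluster member run the check each interval.

```json
{
  "checks": {
    "cluster_health": {
      "command": "check-cluster-health.rb",
      "subscribers": ["roundrobin:my_cluster"],
      "interval": 60,
      "source": "MY_CLUSTER_NAME",
      "handlers": ["my_handler"]
    }
  }
}
```

With a definition along these lines, results would be attributed to a client named MY_CLUSTER_NAME rather than to whichever node happened to run the check.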


#3

Thanks for those links! Your blog post really helped me understand the big picture, and the piece of the puzzle I was missing was the additional check using check-aggregate.rb.
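For anyone finding this later, the general shape of that setup is sketched below, under several assumptions. The per-node check is named as an aggregate and left unhandled, and a second check runs check-aggregate.rb against the aggregate and is the only one that reaches a handler. The check names, commands, subscriptions, handler name, API host, and threshold are all hypothetical, and the check-aggregate.rb flags shown (--api, --check, --critical_count) should be verified against the installed version of sensu-plugins-sensu.

```json
{
  "checks": {
    "cluster_health": {
      "command": "check-cluster-health.rb",
      "subscribers": ["my_cluster"],
      "interval": 60,
      "aggregate": "cluster_health",
      "handle": false
    },
    "cluster_health_aggregate": {
      "command": "check-aggregate.rb --api http://SENSU_API_HOST:4567 --check cluster_health --critical_count 1",
      "subscribers": ["roundrobin:my_cluster"],
      "interval": 60,
      "source": "MY_CLUSTER_NAME",
      "handlers": ["my_handler"]
    }
  }
}
```

The "handle": false setting stays on the per-node check only; the second check is an ordinary check whose single result goes through the handler.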