How do I silence keepalives by subscription?

I am trying to silence the keepalives for a bunch of servers that are shut down overnight. I want to target them by subscription.

If I create one silence per server, using that server's entity subscription and the keepalive check, the keepalive events are marked as silenced. This works:

sensuctl silenced create --subscription 'entity:hostname1' \
              --check keepalive \
              --begin "2019-12-02 23:00:00 NZDT" \
              --expire 28800 \
              --reason 'server off overnight'

If I use the shared subscription instead, the keepalive events are not marked as silenced for any of the servers with that subscription:

sensuctl silenced create --subscription 'serveroff7-11' \
               --check keepalive \
               --begin "2019-12-02 23:00:00 NZDT" \
               --expire 28800 \
               --reason 'server off overnight'

Do I have to silence each host or can I use a subscription to do this?

Sensu Go, Ubuntu Server 18.04

Thanks
Rob

@rabf are those servers named with predictable names? If so, why not use a filter on a keepalive handler?

Thanks @aaronsachs-sensu

We can’t use the server name to differentiate their downtimes, but I could use an annotation on the entity and add some filtering to the keepalive handler. It does mean adding a special case for each set of downtimes (7-7, 11-7, weekdays, all week, etc.).

A single silence by subscription would cover all events during any downtime of these servers - but are keepalives not processed as part of a subscription?

Cheers
Rob

Hi @rabf,

So keepalives are a bit of a special beast inside of Sensu. They’re the only thing that agents ship with, so trying to handle them via subscription is not really an option. The other thing, if you wanted to keep it real simple, is to just turn on deregistration on the agent side (see https://docs.sensu.io/sensu-go/5.15/reference/agent/#ephemeral-agent-configuration-flags). If you do that for servers you know are being shut down, it’ll depopulate them from the dashboard altogether. That reference doc also has some other attributes you could look at tweaking, like keepalive-timeout, but it sounds like just enabling deregistration would be the simplest way to handle things. And then when you turn the servers back on, the agents will report back in and carry on like normal.
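
For anyone following along, here is a rough sketch of what enabling deregistration could look like when starting the agent by hand. The `--deregister` and `--subscriptions` flags are from the agent reference linked above; the `start_ephemeral_agent` helper and the `SENSU_AGENT` override variable are made up for this example, and the backend URL and any other settings your agents need are left out.

```shell
# Sketch only: start an agent flagged as ephemeral so that it is
# deregistered when it shuts down.
# start_ephemeral_agent and SENSU_AGENT are hypothetical names used for
# illustration; SENSU_AGENT lets you point at a different binary
# (or at `echo` for a dry run).
start_ephemeral_agent() {
  "${SENSU_AGENT:-sensu-agent}" start \
    --subscriptions "$1" \
    --deregister
}

# Example call (commented out so that sourcing this file starts nothing):
# start_ephemeral_agent serveroff7-11
```

If the agent runs as a service, the same setting would more likely go in the agent config file than on the command line.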

Thanks for confirming that. Looks like a few options there - we’ll look at auto deregistration as well.
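
Since the per-entity form from the original question does work, another option is to script it. This is a minimal sketch, assuming you can produce the list of hostnames yourself. The `silence_keepalives` helper and the `SENSUCTL` override variable are hypothetical names; the `sensuctl silenced create` flags are the ones used earlier in this thread.

```shell
# Sketch only: create one per-entity keepalive silence for each hostname
# read from stdin (one hostname per line).
# silence_keepalives and SENSUCTL are hypothetical names for illustration;
# setting SENSUCTL=echo gives a dry run that prints the arguments.
silence_keepalives() {
  begin="$1"    # start time, e.g. "2019-12-02 23:00:00 NZDT"
  expire="$2"   # duration in seconds, e.g. 28800
  reason="$3"   # free-text reason
  while IFS= read -r host; do
    [ -n "$host" ] || continue   # skip blank lines
    # </dev/null keeps the command from consuming the hostname list
    "${SENSUCTL:-sensuctl}" silenced create \
      --subscription "entity:${host}" \
      --check keepalive \
      --begin "$begin" \
      --expire "$expire" \
      --reason "$reason" </dev/null
  done
}
```

Usage would be something like `printf '%s\n' hostname1 hostname2 | silence_keepalives "2019-12-02 23:00:00 NZDT" 28800 'server off overnight'`. It still creates one silence per host, so it only saves typing and does not change how keepalives are matched.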