Keepalive + slack + fatigue_plugin

I'm stuck trying to get this combo running.
I have been trying different approaches for two days now, and I'm getting more confused by the hour.
The Slack connection is working.
The keepalives keep coming in to Slack if I delete a host.
I'm not able to fine-tune the behaviour, though.

This is what I want in slack:

  • Warning if no keepalive after 120s.
  • Critical if no keepalive after 180s.
  • Reminder (critical) after 12h, 24h, 36h, etc., until fixed.
  • Notification when resolved.
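
The first two thresholds are agent keepalive settings rather than handler or filter settings. A minimal sketch of an agent.yml fragment, assuming a Sensu Go agent configured through its config file (the file path and the 20-second interval are assumptions, not taken from this thread):

```yaml
# /etc/sensu/agent.yml (fragment; path assumed)
keepalive-interval: 20           # seconds between keepalives (assumed value)
keepalive-warning-timeout: 120   # warning event after 120s without a keepalive
keepalive-critical-timeout: 180  # critical event after 180s without a keepalive
```

The agent has to be restarted before changes to this file take effect.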

I'm just looking for a working solution; I can skip the fatigue plugin if necessary.

Config:
(There are no annotations on the entities, but I have tried with those as well.)
(The arguments in parentheses for fatigue_check might be off; I'm a bit tired and very frustrated right now.)


type: Handler
api_version: core/v2
metadata:
  name: keepalive
  namespace: test
spec:
  handlers:
  - slack
  type: set

---
api_version: core/v2
type: Handler
metadata:
  namespace: test
  name: slack
spec:
  type: pipe
  command: sensu-slack-handler --channel '#larm' --username 'larm' --webhook-url https://hooks.slack.com/services/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
  filters:
  - is_incident
  - fatigue_check(event,1, 43200, 1, 4300)
  runtime_assets:
  - sensu/sensu-slack-handler
  timeout: 10

---
type: EventFilter
api_version: core/v2
metadata:
  name: fatigue_check
  namespace: test
spec:
  action: allow
  expressions:
  - fatigue_check(event)
  runtime_assets:
  - fatigue-filter

Thanks in advance!

The fatigue_check filter may need specific agent annotations (.annotations.fatigue_check/*).
Did you set those annotations on the agent?

Yes, sure,
fatigue_check/keepalive_occurrences: "1"
fatigue_check/keepalive_interval: "3600"
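
For comparison, this is roughly how those annotations would sit in the agent's config file. It is only a sketch, assuming they are set through agent.yml (path assumed) and that the interval is given in seconds. The value 43200 is my substitution to match the 12-hour reminder requested earlier in the thread; the 3600 quoted above would correspond to one hour.

```yaml
# /etc/sensu/agent.yml (fragment; path assumed)
annotations:
  # Alert on the first keepalive failure, then repeat every 12 hours.
  fatigue_check/keepalive_occurrences: "1"
  fatigue_check/keepalive_interval: "43200"  # 12h in seconds (assumed unit)
```

Annotation values are strings, so the numbers stay quoted.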

I even tried passing them as arguments, as per the docs:
“These can be set as arguments to the fatigue_check() function in the filter definition or as entity annotations to override the defaults on a per entity basis.”
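
One reading of that passage is that the arguments belong in the expression of the EventFilter definition, while the handler's filters list only refers to the filter by name. A minimal sketch under that assumption follows. The argument order (event, occurrences, interval, keepalive_occurrences, keepalive_interval) is how I remember the plugin README and should be verified there; 43200 is 12 hours in seconds.

```yaml
---
type: EventFilter
api_version: core/v2
metadata:
  name: fatigue_check
  namespace: test
spec:
  action: allow
  expressions:
  # Assumed argument order: event, occurrences, interval,
  # keepalive_occurrences, keepalive_interval
  - fatigue_check(event, 1, 43200, 1, 43200)
  runtime_assets:
  - fatigue-filter
---
type: Handler
api_version: core/v2
metadata:
  name: slack
  namespace: test
spec:
  type: pipe
  command: sensu-slack-handler --channel '#larm' --username 'larm' --webhook-url https://hooks.slack.com/services/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
  filters:
  # Filter names only; the arguments live in the EventFilter above.
  - is_incident
  - fatigue_check
  runtime_assets:
  - sensu/sensu-slack-handler
  timeout: 10
```

With this layout the handler references the filter by its metadata name, and any per-entity annotations would still override the arguments given in the expression, as the quoted passage describes.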