I’m using sensu-server and sensu-agent version 6.4.2. My setup is below.
3-node sensu-backend / etcd cluster
3-node sensu-server cluster
11 agents running on 11 nodes
I’m facing an issue where a check suddenly stops running (it no longer gets scheduled). It resumes on its own after a few minutes or a few hours. During this period, there are no logs for this particular check. The issue is very inconsistent.
I’m not sure whether it has anything to do with how the check is scheduled (cron or interval); I have tried both.
Below are my test checks.
Sensu server logs
{
  "api_version": "core/v2",
  "type": "Check",
  "metadata": {
    "namespace": "default",
    "name": "check1",
    "annotations": {
      "fatigue_check/occurrences": "5",
      "fatigue_check/interval": "3600",
      "sensu.io.json_attributes": "{\"type\":\"standard\",\"occurrences\":5,\"refresh\":3600}"
    }
  },
  "spec": {
    "command": "python3.6 /etc/sensu/plugins/check1.py",
    "subscriptions": [
      "worker"
    ],
    "publish": true,
    "round_robin": true,
    "interval": 60,
    "handlers": [
      "tester_handler"
    ],
    "proxy_entity_name": "proxyclient",
    "timeout": 50
  }
}
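To pin down when the check stopped being scheduled, one option is to compare the execution timestamps against the 60-second interval. The sketch below is only an illustration: the function name `find_gaps`, the `tolerance` factor, and the sample timestamps are all made up, and it assumes the Unix timestamps were collected from the `executed` field of the event's check history (for example from the JSON output of `sensuctl event info proxyclient check1 --format json`).

```python
from typing import List, Tuple


def find_gaps(executed: List[int], interval: int,
              tolerance: float = 1.5) -> List[Tuple[int, int]]:
    """Return (previous, next) timestamp pairs that are further apart
    than interval * tolerance, i.e. windows where runs were missed."""
    times = sorted(executed)
    gaps = []
    # Compare each execution time with the one that follows it.
    for prev, curr in zip(times, times[1:]):
        if curr - prev > interval * tolerance:
            gaps.append((prev, curr))
    return gaps


# Hypothetical timestamps (seconds since epoch), not real data.
sample = [1000, 1060, 1120, 1480, 1540]
print(find_gaps(sample, interval=60))
```

Each reported pair marks the last execution before the gap and the first one after it, which narrows down the time window to search in the backend and agent logs.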