Sensu keepalive check is failing


#1

I have the Sensu client running on several instances.

I have configured the Sensu check interval to 5 minutes.

For the past few days I have noticed that the Sensu keepalive checks fail for a short time, around 4 minutes, and then recover.

Can anyone suggest what the issue might be, and how to configure handlers for a failing keepalive check?


#2

You may want to look at other processes that run on your network while the keepalive checks are failing (backups, cron jobs, anything with high network I/O, CPU, or memory usage, etc.).

(In reply to IMRAN SHAIK, Wednesday, November 9, 2016 at 10:16:55 PM UTC-5)


#3

No, all the other checks are working fine; only the keepalive check fails.


#4

Have you tried checking the server time?
Maybe the clock drifts before ntpdate corrects it.

Are you using an NTP service, or ntpdate through cron?
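To illustrate why drift matters, here is a rough Python sketch of how a keepalive can look stale when the clocks disagree. It assumes the server judges a keepalive by comparing its own clock with the timestamp the client put in the keepalive, and that the thresholds are 120 and 180 seconds (believed to be the Sensu defaults, so check your own setup). The function name `keepalive_status` is made up for this example and is not part of Sensu.

```python
import time

# Thresholds in seconds. 120/180 are assumed defaults; adjust them to
# whatever your client definition actually uses.
WARNING = 120
CRITICAL = 180


def keepalive_status(client_timestamp, server_now, warning=WARNING, critical=CRITICAL):
    """Classify a keepalive by its apparent age on the server's clock.

    client_timestamp is stamped by the client's clock and server_now comes
    from the server's clock, so any skew between the two clocks is added
    to (or subtracted from) the apparent age.
    """
    age = server_now - client_timestamp
    if age >= critical:
        return "critical"
    if age >= warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    now = int(time.time())
    # Keepalive sent 5 seconds ago, clocks in sync.
    print(keepalive_status(now - 5, now))
    # Same keepalive, but the client clock is 4 minutes behind the server.
    print(keepalive_status(now - 5 - 240, now))
```

With the clocks in sync the 5-second-old keepalive is "ok"; with a hypothetical 4-minute skew the same keepalive appears 245 seconds old and is classified as "critical", even though the client is healthy.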

(In reply to IMRAN SHAIK, Monday, November 14, 2016 at 6:56:18 AM UTC+1)


#5

Yes, your guess was correct. I compared the server time with the clients' time and found a gap of 4-5 minutes between the server and the clients.

I am using the NTP service.

I have also noticed that it is not only the keepalive checks: all the other checks also fail for a few minutes and then recover.

What’s the solution now?


#6

I have installed NTP on the server machine; the keepalive check failures have stopped and the clients are working perfectly.

How do I configure a handler for keepalive check failures?

This configuration is very important to me.

Can anyone help me with it?


#7

This is what my “client.json” looks like. The keepalive handlers and thresholds are set in the "keepalive" section:

{
  "client": {
    "name": "XXX",
    "address": "XXX",
    "environment": "ain1",
    "keepalive": {
      "handlers": [ "mailer", "sms", "logstash" ],
      "thresholds": {
        "warning": 60,
        "critical": 90
      }
    },
    "socket": {
      "bind": "127.0.0.1",
      "port": 3030
    }
  }
}
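If you want to sanity-check a client definition like the one above before restarting the client, a small Python sketch along these lines might help. The function name `check_keepalive_config` and the two rules it checks are my own assumptions, not something Sensu provides. It only confirms that the file parses as JSON, that at least one keepalive handler is listed, and that the warning threshold is lower than the critical one.

```python
import json

SAMPLE = """
{
  "client": {
    "name": "XXX",
    "address": "XXX",
    "keepalive": {
      "handlers": ["mailer", "sms", "logstash"],
      "thresholds": {"warning": 60, "critical": 90}
    }
  }
}
"""


def check_keepalive_config(text):
    """Return a list of problems found in the keepalive section of a client definition."""
    problems = []
    # json.loads raises ValueError on invalid JSON, e.g. curly quotes
    # pasted in from an email client.
    config = json.loads(text)
    keepalive = config.get("client", {}).get("keepalive", {})

    if not keepalive.get("handlers", []):
        problems.append("no keepalive handlers configured")

    thresholds = keepalive.get("thresholds", {})
    warning = thresholds.get("warning")
    critical = thresholds.get("critical")
    if warning is not None and critical is not None and warning >= critical:
        problems.append("warning threshold should be lower than critical")

    return problems


if __name__ == "__main__":
    for problem in check_keepalive_config(SAMPLE) or ["no problems found"]:
        print(problem)
```

The handler names themselves ("mailer", "sms", "logstash") still have to match handlers defined on the server side; this sketch does not check that.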

Regards.
@shankerbalan

(In reply to IMRAN SHAIK <imranrehman5155@gmail.com>, 21-Nov-2016, 7:19 PM)