keepalives are failing

windowsrefund · May 29, 2013, 5:07pm

I have a few “problematic” nodes where keepalives keep failing. The problem is not clock or network related, since one of the affected nodes is the Sensu server itself.

Here’s some client debug output showing the keepalive is published every 20 seconds:

{"timestamp":"2013-05-29T13:02:05.787062-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846925}}
{"timestamp":"2013-05-29T13:02:25.787971-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846945}}
{"timestamp":"2013-05-29T13:02:45.788884-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846965}}

And yet the keepalive history in Redis shows nothing but status 2 (critical):

akosmin@oreo:~$ redis-cli lrange history:oreo:keepalive 0 -1

“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
“2”
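
For anyone who wants to double-check the client side, the publish interval can be read from the payload timestamps in the debug log. This is only a rough Python sketch: the function name is made up for illustration and is not part of Sensu, and the log lines are the three quoted above with plain ASCII quotes.

```python
import json

# The three client debug lines quoted above, with plain ASCII quotes.
LOG_LINES = [
    '{"timestamp":"2013-05-29T13:02:05.787062-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846925}}',
    '{"timestamp":"2013-05-29T13:02:25.787971-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846945}}',
    '{"timestamp":"2013-05-29T13:02:45.788884-0400","level":"debug","message":"publishing keepalive","payload":{"name":"oreo","address":"10.250.250.81","subscriptions":["all","sensu server","openvzve"],"timestamp":1369846965}}',
]


def keepalive_intervals(lines):
    """Return the gaps, in seconds, between consecutive keepalive payloads."""
    stamps = []
    for line in lines:
        entry = json.loads(line)
        # Ignore any other debug messages that may be mixed into the log.
        if entry.get("message") == "publishing keepalive":
            stamps.append(entry["payload"]["timestamp"])
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


print(keepalive_intervals(LOG_LINES))  # prints [20, 20]
```

A steady 20-second gap here only shows that the client is publishing. It says nothing about whether the server is consuming the keepalives.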

windowsrefund · May 29, 2013, 5:11pm

Forgot to mention: this is Sensu 0.9.13.

Quenten_Griffith · June 26, 2013, 2:18pm

I'm not sure whether you have fixed this yet, but I ran into the same issue. My fix was to stop the client, remove it through the dashboard, and then start it back up.

Sean_Porter · July 5, 2013, 11:37pm

Can you check the client information in Redis (using redis-cli) to verify whether the timestamp is being updated? Quenten’s fix suggests there may be a bug.
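
A minimal sketch of that check in Python, under two assumptions that are not confirmed in this thread and should be verified against your install. The first is that Sensu 0.9.x stores each client's last keepalive as a JSON string under a Redis key such as `client:oreo` (so `redis-cli get client:oreo` would return it). The second is that the server raises a warning after 120 seconds without a keepalive and a critical after 180 seconds. The function name is illustrative only.

```python
import json
import time

# Assumed staleness thresholds in seconds (not confirmed for 0.9.13).
WARNING_AFTER = 120
CRITICAL_AFTER = 180


def keepalive_status(client_json, now=None):
    """Return (age_in_seconds, status) for a stored client record.

    client_json is the JSON string stored for the client, e.g. the output
    of `redis-cli get client:oreo` (key name is an assumption).
    """
    client = json.loads(client_json)
    if now is None:
        now = int(time.time())
    age = now - client["timestamp"]
    if age >= CRITICAL_AFTER:
        status = 2  # critical, matches the "2" entries in the history list
    elif age >= WARNING_AFTER:
        status = 1  # warning
    else:
        status = 0  # ok
    return age, status


# Example with the last payload from the debug log, checked 10 seconds later.
record = '{"name":"oreo","address":"10.250.250.81","timestamp":1369846965}'
print(keepalive_status(record, now=1369846975))  # prints (10, 0)
```

If the stored timestamp stays fixed while the client log shows fresh keepalives, the server is not processing them, which would be consistent with the workaround of removing and re-registering the client.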
