Sensu agent hangs after logging this error: "detected missing amqp heartbeats"

Tom_Pride · August 18, 2014, 2:46am

Hi Guys,

We’ve recently found that several of our servers running sensu client version 0.12.2-1 on CentOS 6.4 x86_64 have hung immediately after writing the following error to /var/log/sensu/sensu-client.log:

{"timestamp":"2014-08-16T01:32:41.841667+1000","level":"error","message":"detected missing amqp heartbeats"}


Once the sensu-client has logged the above error, its process does not exit, but it appears to have hung: it no longer writes anything to its logs, and the following event is reported in the sensu server dashboard:

lwa0002.mydomain.com
keepalive
No keep-alive sent from client in over 180 seconds

The only way I have found to recover from this state is to restart the sensu-client. I’ve looked at our sensu server performance graphs to see what the sensu server was doing at the time the sensu-clients logged “detected missing amqp heartbeats”. The server’s load had spiked to well over 40, so I’m confident that the high load on the sensu server is what caused the clients to log the error.

Even so, my argument is that this is possibly a bug in the sensu-client. It shouldn’t hang after logging the error; it should either recover once it starts getting a response from RabbitMQ again or, at the very least, exit cleanly. The sensu-client shouldn’t just hang around waiting for some sort of human intervention.
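As a stopgap, the manual restart could be automated. Below is a rough sketch in Python of a watchdog that could be run from cron. It rests on assumptions that are not confirmed here: that a hung client always leaves the heartbeat error as the last line of its log, and that `service sensu-client restart` is the right restart command on the host. The function names are made up for illustration.

```python
import json
import subprocess

LOG_PATH = "/var/log/sensu/sensu-client.log"
HEARTBEAT_ERROR = "detected missing amqp heartbeats"


def is_heartbeat_error(line):
    """Return True if a JSON log line is the missing-heartbeats error."""
    try:
        entry = json.loads(line)
    except ValueError:
        return False  # not a JSON log line
    if not isinstance(entry, dict):
        return False
    return (entry.get("level") == "error"
            and entry.get("message") == HEARTBEAT_ERROR)


def client_looks_hung(lines):
    """Assumption: a hung client logs nothing after the heartbeat error,
    so the last non-blank log line is that error."""
    for line in reversed(lines):
        if line.strip():
            return is_heartbeat_error(line)
    return False


def check_and_restart(path=LOG_PATH):
    """Restart the client if its log ends with the heartbeat error."""
    try:
        with open(path) as log:
            lines = log.readlines()
    except OSError:
        return False  # log missing or unreadable; do nothing
    if client_looks_hung(lines):
        # Assumption: the init script is named sensu-client.
        subprocess.call(["service", "sensu-client", "restart"])
        return True
    return False


if __name__ == "__main__":
    check_and_restart()
```

This only papers over the symptom. It does not address why the client fails to reconnect by itself, and it would also restart a client that happens to be idle right after logging the error.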

Any advice or input regarding this issue would be very much appreciated.

Cheers,

Tom
