Sensu sometimes not consuming results queue

fofloinn · December 13, 2018, 9:08am

Hi,

We are seeing Sensu intermittently stop consuming the results queue, which built up to around 200,000 messages. After we left it alone for about an hour, the queue eventually drained.

This happened once in the morning, then again in the afternoon.

While troubleshooting, I changed the server ID stored in the Redis key task:check_result_monitor:server to the other host in our cluster, and that host then consumed the messages.

I'm not sure what's happening here.

During the issue, I could see some warnings in the logs on one of our Sensu server hosts:

{"timestamp":"2018-12-12T08:39:02.722432+0000","level":"warn","message":"another sensu server is responsible for the task","task":"client_monitor"}
{"timestamp":"2018-12-12T08:39:13.352582+0000","level":"warn","message":"another sensu server is responsible for the task","task":"check_result_monitor"}
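
For anyone wanting to see how often each server logs this warning, here is a minimal Python sketch that counts the "another sensu server is responsible for the task" entries per task. It assumes the log is one JSON object per line with the `level`, `message`, and `task` fields shown above; the function name `count_task_warnings` and the `sample_lines` list are only illustrative.

```python
import json
from collections import Counter

# Message text copied from the warnings quoted in this thread.
WARNING = "another sensu server is responsible for the task"


def count_task_warnings(lines):
    """Count the task-election warnings per task in JSON log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("level") == "warn" and entry.get("message") == WARNING:
            counts[entry.get("task", "unknown")] += 1
    return counts


# The two log lines from the post, used here as sample input.
sample_lines = [
    '{"timestamp":"2018-12-12T08:39:02.722432+0000","level":"warn",'
    '"message":"another sensu server is responsible for the task",'
    '"task":"client_monitor"}',
    '{"timestamp":"2018-12-12T08:39:13.352582+0000","level":"warn",'
    '"message":"another sensu server is responsible for the task",'
    '"task":"check_result_monitor"}',
]

for task, n in sorted(count_task_warnings(sample_lines).items()):
    print(task, n)
```

Running the same function over the log file from each Sensu server (for example `count_task_warnings(open(path))`, with `path` pointing at wherever your logs live) should make it easier to compare the two hosts.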

2-node Sensu cluster (v1.6.1)
3-node RabbitMQ cluster (v3.7.7)
3-node Redis cluster (redis-sentinel, v3.2.12)
OS on all nodes: RHEL 7.5

Any help with this issue would be much appreciated.

Cheers,
Fearghal

richard · December 19, 2018, 6:11pm

Hello fofloinn,
Do you happen to know which of the two Sensu servers those log messages came from? Did they appear before or after the Redis change? Were they on the Sensu server originally elected to the check_result_monitor task, or on the one you changed it to? Are you able to provide the Sensu server logs covering the initial build-up and the drain?

As to what to monitor for if this happens again:

Which one of the servers holds the check_result_monitor task. You can find that out by hitting the info API endpoint, for example: curl http://127.0.0.1/info if run on the server where the API is running.
Log entries for the server with the check_result_monitor:server task, as well as for the one without that task. Make note of which is which.
Is the keepalive queue showing any build up?
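
The checks above could be scripted roughly as follows. This is only a sketch: it assumes the /info response has a `servers` list whose entries carry `hostname` and `tasks`, plus a `transport` object with `results` and `keepalives` message counts. Please verify those field names against your own /info output, since they are my assumption here. The helper names `fetch_info` and `summarize`, and the values in `sample_info`, are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_info(url="http://127.0.0.1/info", timeout=5):
    """Fetch and decode the /info response (URL as in the curl example)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize(info, task="check_result_monitor"):
    """Report which servers list the task, plus the queue depths."""
    holders = [
        server.get("hostname") or server.get("id")
        for server in info.get("servers", [])
        if task in server.get("tasks", [])
    ]
    transport = info.get("transport", {})
    return {
        "task_holders": holders,
        "results_messages": transport.get("results", {}).get("messages"),
        "keepalives_messages": transport.get("keepalives", {}).get("messages"),
    }


# Hypothetical payload in the assumed shape; the values are made up.
sample_info = {
    "transport": {
        "keepalives": {"messages": 0, "consumers": 2},
        "results": {"messages": 1500, "consumers": 2},
        "connected": True,
    },
    "servers": [
        {"hostname": "sensu-a", "tasks": ["check_result_monitor"]},
        {"hostname": "sensu-b", "tasks": ["client_monitor"]},
    ],
}

print(summarize(sample_info))
```

Against a live server you would call `summarize(fetch_info())` instead, ideally from a cron job or check so the output is captured while the queue is building up.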

Regards,
Richard.

aaronsachs · April 1, 2019, 10:00pm

This topic was automatically closed after 20 days. New replies are no longer allowed.
