Occasionally our Sensu cluster gets into a state where the RabbitMQ "results" queue piles up with thousands of events and keeps growing, while CPU usage on all Sensu server nodes stays low, around 50%.
Previously we’ve solved this problem by stopping all Sensu server nodes, which deletes the "results" queue, then clearing all "history:" and "results:" keys from Redis so fewer events get generated when the Sensu server nodes start up again.
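For reference, our reset procedure looks roughly like the sketch below. Hostnames and credentials are placeholders, and purging via `rabbitmqadmin` (which needs the RabbitMQ management plugin) is shown as an alternative to relying on the queue being deleted when the servers stop:

```shell
# Stop all Sensu server nodes first, e.g. on each node:
#   sudo service sensu-server stop

# Optionally purge the "results" queue explicitly
# (host/credentials below are placeholders)
rabbitmqadmin --host=rabbitmq.example.com --username=sensu --password=secret \
  purge queue name=results

# Delete all "history:" and "results:" keys from Redis. Using SCAN
# iterates incrementally instead of blocking the server like KEYS would.
for pattern in 'history:*' 'results:*'; do
  redis-cli -h redis.example.com --scan --pattern "$pattern" \
    | xargs -r -n 100 redis-cli -h redis.example.com del
done

# Then bring the Sensu server nodes back up:
#   sudo service sensu-server start
```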
While purging queues and Redis keys has worked in the past, it’s not working now. Our Sensu cluster keeps piling up messages.
Does anyone have any ideas on how to solve this?
We’re not running any handlers or filters, so the Sensu servers aren’t busy waiting on I/O. CPU usage on the RabbitMQ node is under 25%. We’re running Sensu 0.24.1-1 from the Ubuntu APT repository.