sensu-server and sensu-api down


#1

Hi,

Recently I installed Sensu (versions 0.13.0 and 0.13.1) to monitor a production datacenter, and three times in the last two or three weeks I have found that sensu-server and sensu-api had exited on their own.

I am now updating to 0.14.0, hoping that it will solve the problem, but I’m wondering whether Sensu is mature enough to be used in production.

Is there any misconfiguration that could cause those processes to die?

Is anyone else monitoring a production environment with Sensu? Are you happy with it?

Thanks.


#2

I’m monitoring a multi-billion dollar ecommerce site with it, so yes.

Do you have errors in the log when it exited?

Thanks,

Bryan



#3

Hi,

Yes, there are errors related to AMQP. Whenever the connection with RabbitMQ dies, it seems the client and the API exit.

In the RabbitMQ logs I find this from the time sensu-api died:

=ERROR REPORT==== 1-Oct-2014::15:33:36 ===
connection <0.756.0>, channel 1 - soft error:
{amqp_error,precondition_failed,"unknown delivery tag 1172",'basic.ack'}

=ERROR REPORT==== 1-Oct-2014::15:33:36 ===
AMQP connection <0.756.0> (running), channel 1 - error:
{amqp_error,channel_error,"expected 'channel.open'",'basic.recover'}

=ERROR REPORT==== 1-Oct-2014::15:33:38 ===
connection <0.775.0>, channel 1 - soft error:
{amqp_error,not_found,"no queue 'keepalives' in vhost 'sensu'",
'queue.declare'}

=INFO REPORT==== 1-Oct-2014::15:33:39 ===
closing AMQP connection <0.775.0> (172.42.29.222:38763 -> 172.42.29.222:5671)
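
To line these RabbitMQ errors up against the times the Sensu processes exited, the amqp_error entries can be pulled out of the log with a short script. This is only a rough sketch: the function name `parse_amqp_errors` and the embedded sample are illustrative, and it assumes the log uses the same report layout as the excerpt above, with plain ASCII quotes.

```python
import re

# Sample in the same layout as the RabbitMQ log excerpt (ASCII quotes assumed).
SAMPLE_LOG = """\
=ERROR REPORT==== 1-Oct-2014::15:33:36 ===
connection <0.756.0>, channel 1 - soft error:
{amqp_error,precondition_failed,"unknown delivery tag 1172",'basic.ack'}

=ERROR REPORT==== 1-Oct-2014::15:33:38 ===
connection <0.775.0>, channel 1 - soft error:
{amqp_error,not_found,"no queue 'keepalives' in vhost 'sensu'",
'queue.declare'}
"""

# Header line of each report, e.g. "=ERROR REPORT==== 1-Oct-2014::15:33:36 ==="
REPORT_RE = re.compile(r"^=(\w+) REPORT==== (\S+) ===$", re.MULTILINE)

# The Erlang tuple {amqp_error,<name>,"<message>",'<method>'}; the method may
# be wrapped onto the next line, hence the optional whitespace.
ERROR_RE = re.compile(r"""\{amqp_error,(\w+),"([^"]*)",\s*'([^']+)'\}""")


def parse_amqp_errors(log_text):
    """Return a list of (timestamp, error_name, message, method) tuples."""
    results = []
    headers = list(REPORT_RE.finditer(log_text))
    for i, header in enumerate(headers):
        # A report body runs from its header to the next header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(log_text)
        body = log_text[header.end():end]
        for match in ERROR_RE.finditer(body):
            results.append(
                (header.group(2), match.group(1), match.group(2), match.group(3))
            )
    return results


if __name__ == "__main__":
    for timestamp, name, message, method in parse_amqp_errors(SAMPLE_LOG):
        print(timestamp, name, method, message)
```

Comparing the timestamps it prints with the last lines of the sensu-server and sensu-api logs should show whether each exit follows a channel error like the ones above.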
