Sensu client fails to connect to rabbitmq


#1

Hi,

We have hundreds of hosts running sensu just fine, but on one box, the sensu client fails to start, claiming it can’t connect to rabbitmq.

[root@testtprundeck01 sensu]# tail -7 /var/log/sensu/sensu-client.log
{"timestamp":"2014-10-16T10:07:28.277643-0700","level":"debug","message":"scheduling standalone checks"}
{"timestamp":"2014-10-16T10:07:28.277880-0700","level":"debug","message":"binding client tcp and udp sockets","options":{"bind":"127.0.0.1","port":3030}}
{"timestamp":"2014-10-16T10:07:28.280065-0700","level":"warn","message":"reconnecting to transport"}
{"timestamp":"2014-10-16T10:07:28.280259-0700","level":"fatal","message":"transport connection error","error":"failed to connect to rabbitmq"}
{"timestamp":"2014-10-16T10:07:28.280339-0700","level":"warn","message":"stopping"}
{"timestamp":"2014-10-16T10:07:28.280434-0700","level":"info","message":"completing checks in progress","checks_in_progress":[]}
{"timestamp":"2014-10-16T10:07:28.782482-0700","level":"warn","message":"stopping reactor"}

Usually when we see this, it just means the rabbitmq host is set incorrectly in the config. But that doesn’t seem to be the case this time. The host itself is able to connect to rabbitmq via telnet, so it doesn’t seem to be a problem with name resolution or network connectivity.

[root@testtprundeck01 sensu]# cat /etc/sensu/config.json
{
  "rabbitmq": {
    "host": "stgtpsensu01.foo.com",
    "port": 5672,
    "vhost": "/sensu",
    "user": "sensu",
    "password": "password",
    "ssl": {
      "cert_chain_file": "/etc/sensu/ssl/cert.pem",
      "private_key_file": "/etc/sensu/ssl/key.pem"
    }
  },
  "redis": {
    "host": "stgtpsensu01.foo.com",
    "port": 6379
  },
  "api": {
    "host": "localhost",
    "bind": "0.0.0.0",
    "port": 4567
  },
  "dashboard": {
    "bind": "0.0.0.0",
    "port": 8080,
    "user": "admin",
    "password": "secret"
  }
}
[root@testtprundeck01 sensu]# telnet stgtpsensu01.foo.com 5672
Trying 192.168.50.21...
Connected to stgtpsensu01.foo.com (192.168.50.21).
Escape character is '^]'.
^]
telnet> Connection closed.
[root@testtprundeck01 sensu]#

I don’t know what to do with this. Any help is appreciated.

Thanks!

-Travis
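
A successful telnet connection only shows that the TCP port is open; it does not show that the peer completes an AMQP handshake. A minimal sketch of a slightly deeper probe follows, assuming the broker speaks AMQP 0-9-1. The function names `classify_reply` and `probe` are made up for illustration and are not part of Sensu or RabbitMQ.

```python
import socket

# Protocol header an AMQP 0-9-1 client sends first: "AMQP" 0 0 9 1.
AMQP_091_HEADER = b"AMQP\x00\x00\x09\x01"


def classify_reply(data: bytes) -> str:
    """Roughly classify the first bytes a peer sends back (hypothetical helper)."""
    if not data:
        return "closed"            # peer closed the socket without replying
    if data.startswith(b"AMQP"):
        return "version-mismatch"  # broker answered with its own protocol header
    if data[0] == 1:
        return "amqp"              # frame type 1 is a method frame, presumably Connection.Start
    return "unknown"


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Send the AMQP protocol header and classify whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(AMQP_091_HEADER)
        try:
            data = sock.recv(8)
        except socket.timeout:
            return "timeout"       # a listener expecting TLS may simply wait
        return classify_reply(data)

# Example call, using the host and port from the config above:
#   probe("stgtpsensu01.foo.com", 5672)
```

A result of "amqp" would suggest a plain AMQP listener is answering on that port; anything else points at the protocol layer rather than at name resolution or routing.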


#2

Do you see anything odd in the rabbitmq logs? That’d be the place I’d look next.


André Dieb Martins
andredieb.com


#3

You have good instincts, André.

The rabbitmq server is rejecting connections from this host.

=INFO REPORT==== 15-Oct-2014::18:29:20 ===
accepting AMQP connection <0.16296.0> (192.168.48.162:58640 -> 192.168.50.21:5672)

=ERROR REPORT==== 15-Oct-2014::18:29:20 ===
closing AMQP connection <0.16296.0> (192.168.48.162:58640 -> 192.168.50.21:5672):
{bad_header,<<129,15,1,3,3,0,246,0>>}

Are there common causes for this problem? Based on the error, I’m assuming this is due to a bad client library or an incorrect client library version.

Thanks!

-Travis
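
The eight bytes in the bad_header report can be compared against what an AMQP 0-9-1 client is expected to send first. The sketch below is one possible reading and is not confirmed anywhere in this thread: it assumes the SSLv2-compatible ClientHello layout (a two-byte length with the high bit set, message type 1, then the offered version). `describe_header` is a hypothetical helper, not a RabbitMQ or Sensu function.

```python
# What an AMQP 0-9-1 client should send first: "AMQP" 0 0 9 1.
AMQP_091_HEADER = b"AMQP\x00\x00\x09\x01"

# The bytes RabbitMQ logged as {bad_header,<<129,15,1,3,3,0,246,0>>}.
BAD_HEADER = bytes([129, 15, 1, 3, 3, 0, 246, 0])


def describe_header(data: bytes) -> str:
    """Guess which protocol the first bytes of a connection belong to."""
    if data == AMQP_091_HEADER:
        return "AMQP 0-9-1 protocol header"
    # TLS record layer: content type 0x16 (handshake), major version 3.
    if len(data) >= 3 and data[0] == 0x16 and data[1] == 0x03:
        return "TLS handshake record"
    # SSLv2-style ClientHello: high bit set in the length, message type 1.
    if len(data) >= 5 and data[0] & 0x80 and data[2] == 0x01:
        length = ((data[0] & 0x7F) << 8) | data[1]
        return (f"SSLv2-style ClientHello, length {length}, "
                f"offering version {data[3]}.{data[4]}")
    return "unrecognised"


print(describe_header(BAD_HEADER))
# prints: SSLv2-style ClientHello, length 271, offering version 3.3
```

Version 3.3 is how TLS 1.2 is encoded on the wire. If this reading is right, the client opened a TLS handshake against a listener that expected plain AMQP. The config posted earlier has an ssl block together with port 5672, which is RabbitMQ's default plain AMQP port (5671 is the default for TLS), so it may be worth checking that combination against the hosts that work.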
