Remote sensu client not able to talk to RabbitMQ


#1

Hi all,
I have read several posts on this group about this issue, but none of them helped. After trying for quite some time, I am asking for your help.

I have a Sensu server running behind an Amazon ELB in a private subnet of an Amazon VPC. A Sensu client also runs on this box, and it shows up just fine in Uchiwa. Some of my boxes are not in the VPC, which is why I am using SSL. From all of these non-VPC boxes I can telnet to my Sensu server on port 5671. The problem is that the Sensu client on these boxes constantly logs:

{"timestamp":"2015-10-22T05:34:14.922835+0000","level":"error","message":"[amqp] Detected TCP connection failure"}

I used the same cert.pem and key.pem on these boxes as on the sensu-server box (which runs the local client). Using openssl (http://www.rabbitmq.com/troubleshooting-ssl.html), I found that the connection hangs and eventually times out:

openssl s_client -msg -state -connect x.y.z.elb.amazonaws.com:8443 -ssl3 -cert cert.pem -key key.pem -CAfile cacert.pem

CONNECTED(00000003)
SSL_connect:before/connect initialization
SSL 3.0 Handshake [length 0077], ClientHello
    01 00 00 73 03 00 ac 93 bf 7c 62 67 74 e7 12 7f
    ec 81 29 90 79 91 03 3a b2 d0 66 70 7c 94 f6 dd
    0c ca 7a 73 f9 c8 00 00 4c c0 14 c0 0a 00 39 00
    38 00 88 00 87 c0 0f c0 05 00 35 00 84 c0 13 c0
    09 00 33 00 32 c0 12 c0 08 00 9a 00 99 00 45 00
    44 00 16 00 13 c0 0e c0 04 c0 0d c0 03 00 2f 00
    96 00 41 00 0a 00 07 c0 11 c0 07 c0 0c c0 02 00
    05 00 04 00 ff 01 00
SSL_connect:SSLv3 write client hello A
SSL_connect:failed in SSLv3 read server hello A
140290208601952:error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure:s3_pkt.c:617:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 0 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol : SSLv3
    Cipher : 0000
    Session-ID:
    Session-ID-ctx:
    Master-Key:
    Key-Arg : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    Start Time: 1445491533
    Timeout : 7200 (sec)
    Verify return code: 0 (ok)
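The same check can be scripted. Below is a minimal sketch, assuming Python 3 and only its standard library; `probe_tls` is a hypothetical helper, not part of Sensu or RabbitMQ. It tries to separate a TCP-level failure from a TLS handshake that never gets a reply, which is roughly the difference between what telnet tests and what the Sensu client needs. Unlike the `-ssl3` flag above, it does not force SSLv3; it uses whatever protocol versions the supplied context allows.

```python
import socket
import ssl


def probe_tls(host, port, context, timeout=5.0):
    """Return (stage, detail) describing how far a TLS connection got.

    stage is "tcp" if the TCP connect failed, "tls-timeout" if the peer
    accepted the connection but never answered the ClientHello, "tls" for
    any other handshake error, and "ok" if the handshake completed.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return ("tcp", str(exc))
    try:
        # wrap_socket performs the handshake and inherits the socket timeout.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return ("ok", tls.version())
    except socket.timeout:
        return ("tls-timeout", "no handshake reply within %.1fs" % timeout)
    except (ssl.SSLError, OSError) as exc:
        return ("tls", str(exc))
    finally:
        sock.close()
```

To mirror the openssl command, the context could be built with `ssl.create_default_context(cafile="cacert.pem")` followed by `context.load_cert_chain("cert.pem", "key.pem")`, and the probe called as `probe_tls("x.y.z.elb.amazonaws.com", 8443, context)`. A certificate or hostname verification problem would show up under the "tls" stage.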

Versions I am using:

Sensu Version : 0.20.3
Erlang: R16B
CentOS 7
RabbitMQ: 3.3.5

rabbitmq.config:

[
  {rabbit, [
    {ssl_listeners, [5671]},
    {ssl_options, [{cacertfile,"/etc/rabbitmq/ssl/cacert.pem"},
                   {certfile,"/etc/rabbitmq/ssl/cert.pem"},
                   {keyfile,"/etc/rabbitmq/ssl/key.pem"},
                   {verify,verify_peer},
                   {fail_if_no_peer_cert,true}]}
  ]}
].

rabbitmq.json:

{
  "rabbitmq": {
    "ssl": {
      "cert_chain_file": "/etc/sensu/ssl/cert.pem",
      "private_key_file": "/etc/sensu/ssl/key.pem"
    },
    "host": "192.3.1.95",
    "port": 5671,
    "vhost": "/sensu",
    "user": "sensu",
    "password": "REDACTED"
  }
}
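One thing that can be checked locally, before the network is involved at all, is whether the certificate and key named in this file load as a pair on the client. A minimal sketch, assuming Python 3; `check_client_ssl` is a hypothetical helper, and the location of rabbitmq.json on disk is not given in the post, so the path passed in is an assumption.

```python
import json
import ssl


def check_client_ssl(config_path):
    """Try to load the cert/key pair named in a Sensu rabbitmq.json file."""
    with open(config_path) as fh:
        rabbit = json.load(fh)["rabbitmq"]
    paths = rabbit.get("ssl", {})
    cert = paths.get("cert_chain_file")
    key = paths.get("private_key_file")
    if not cert or not key:
        return "no ssl section"
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Raises ssl.SSLError if the files are malformed or do not match.
        context.load_cert_chain(cert, key)
    except ssl.SSLError as exc:
        return "cert/key problem: %s" % exc
    except OSError as exc:
        # Missing or unreadable files end up here.
        return "cannot read files: %s" % exc
    return "ok"
```

This only shows that the pair is usable locally. It says nothing about whether the broker trusts the CA that signed the certificate.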

I am not sure why this is happening or what the best way out is. Any help is appreciated.

Thanks


#2

I'm pretty sure that this is normal ELB behavior, where the CNAME
changes and the Sensu components "don't know" to re-resolve the DNS
name.
This is probably the #1 problem with DNS-based service discovery.
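To make the stale-address idea concrete, here is a minimal sketch, assuming Python 3. `resolve_ips` and `is_stale` are hypothetical helpers, not anything Sensu provides; they only compare an address a process connected to earlier with what the name resolves to right now.

```python
import socket


def resolve_ips(host, port):
    """Return the set of IP addresses the name resolves to at this moment."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def is_stale(connected_ip, host, port):
    """True if the address we connected to is no longer behind the name."""
    return connected_ip not in resolve_ips(host, port)
```

A long-running client that resolved the ELB name once at startup would keep using `connected_ip` even after `is_stale` starts returning True for it.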

Your tests (telnet/ssl) work fine because you are resolving the ELB IP
"just in time", whereas all the other components have stale IPs.

I work around this in my infrastructure by putting my Sensu components
under a supervisor (upstart) and setting reconnect_on_failure: false,
so that when things change Sensu simply exits, and then upstart
restarts it and it re-resolves the name.
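The exit-and-restart pattern can be sketched in a few lines. This is only an illustration in Python 3 of what a supervisor such as upstart does, not a replacement for one; `supervise` is a hypothetical helper, and it is bounded by `max_restarts` so that it terminates, whereas a real supervisor keeps going.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, delay=1.0):
    """Run cmd, start it again each time it exits, and return the exit codes.

    Every fresh start is a new process, so the broker hostname is resolved
    again instead of reusing an address cached by the old process.
    """
    codes = []
    for attempt in range(max_restarts + 1):
        if attempt:
            time.sleep(delay)  # brief pause between restarts
        codes.append(subprocess.call(cmd))
    return codes
```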

Also, ELBs have limited connection timeouts and are not really
well-suited for long-lived connections.

This has been discussed a bit before:
https://groups.google.com/forum/#!topic/sensu-users/rerGpPY3gVw

And in general, running RabbitMQ behind an ELB in AWS has its own gotchas.

I would be interested to hear from others who have used this kind of
setup in production to see how else they do it.


#3

Thanks for the reply, Kyle. I will certainly look into this ELB + RabbitMQ issue.


#4

OK, so in order to take the ELB out of the picture, I linked one of the non-VPC boxes to the VPC using VPC link. This box can now talk to the Sensu box via its private IP. I provisioned a new Sensu server; its local client runs just fine and can be seen in Uchiwa too, but when I started the client on the remote box I still saw the same error:

{"timestamp":"2015-10-22T18:05:07.923417+0000","level":"error","message":"[amqp] Detected TCP connection failure"}

To rule out password issues, I made a curl call to /api/whoami on the Sensu server, port 15672, with sensu:password, and that worked (I had already made sensu a management user).

rabbitmq.json on this client is as follows:

{
  "rabbitmq": {
    "ssl": {
      "cert_chain_file": "/etc/sensu/ssl/cert.pem",
      "private_key_file": "/etc/sensu/ssl/key.pem"
    },
    "host": "192.3.7.176",
    "port": 5671,
    "vhost": "/sensu",
    "user": "sensu",
    "password": "something"
  }
}

When I telnet from this client to the Sensu server, RabbitMQ accepts the connection and (as expected) times out:

=INFO REPORT==== 22-Oct-2015::18:12:31 ===
accepting AMQP connection <0.548.0> (10.198.144.210:58639 -> 192.3.7.176:5671)

=INFO REPORT==== 22-Oct-2015::18:12:31 ===
accepting AMQP connection <0.548.0> (10.198.144.210:58639 -> 192.3.7.176:5671)

=ERROR REPORT==== 22-Oct-2015::18:12:36 ===
error on AMQP connection <0.548.0>:
{ssl_upgrade_error,timeout}

=ERROR REPORT==== 22-Oct-2015::18:12:36 ===
error on AMQP connection <0.548.0>:
{ssl_upgrade_error,timeout}
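The timing in these entries can be read off mechanically. A minimal sketch, assuming Python 3 and a log that uses the report header format shown above; `report_times` and `seconds_until_error` are hypothetical helpers. It measures how long the broker waited between accepting the TCP connection and giving up on the SSL upgrade, which is what one would expect to see when the peer (telnet here) never sends a ClientHello.

```python
import re
from datetime import datetime

# Matches headers such as "=INFO REPORT==== 22-Oct-2015::18:12:31 ==="
REPORT = re.compile(r"=(INFO|ERROR) REPORT==== (\S+) ===")


def report_times(log_text):
    """Return a list of (level, datetime) for each report header."""
    return [
        (level, datetime.strptime(stamp, "%d-%b-%Y::%H:%M:%S"))
        for level, stamp in REPORT.findall(log_text)
    ]


def seconds_until_error(log_text):
    """Seconds between the first INFO report and the first ERROR report."""
    times = report_times(log_text)
    first_info = next(t for level, t in times if level == "INFO")
    first_error = next(t for level, t in times if level == "ERROR")
    return (first_error - first_info).total_seconds()
```

Run against the excerpt above, this gives the gap between the 18:12:31 accept and the 18:12:36 error.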

Any other ideas as to why my remote client is not talking to RabbitMQ?

Thanks


#5

UPDATE: Establishing a connection with openssl works, so removing the ELB from the picture certainly helped, but I think I have more than one communication issue here. Any help is appreciated.


#6

Several things worth trying:

  1. Test without SSL.

  2. Check the RabbitMQ vhost name (the leading forward slash isn’t necessary).

  3. Check the Erlang version; see the RabbitMQ site for details about Erlang versions and SSL.
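For point 2, the management API that was already used for /api/whoami earlier in this thread can also show whether the user has permissions on the exact vhost name the client is configured with. A minimal sketch, assuming Python 3 and that the management plugin listens on port 15672 as in that earlier test; `permissions_url` is a hypothetical helper. The one detail worth getting right is that the vhost name has to be percent-encoded in the URL path, so a leading slash becomes %2F.

```python
from urllib.parse import quote


def permissions_url(host, vhost, user, port=15672):
    """Build the management API URL for a user's permissions on a vhost."""
    # safe="" makes quote() encode "/" as well, which the API path requires.
    return "http://%s:%d/api/permissions/%s/%s" % (
        host, port, quote(vhost, safe=""), quote(user, safe="")
    )
```

Requesting that URL with the same sensu credentials should return the permissions if the vhost name matches what exists on the broker; a vhost created as "sensu" and one created as "/sensu" are different names.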

Apologies for brevity, sending this from my phone,

Cheers,

Rob
