Very new to sensu & sensu-go.
Environment: 3 node cluster running Ubuntu 20.04.5 LTS and sensu-go-backend 6.7.4
I have followed the instructions in the sensu docs to create a CA and generate certificates for the 3 nodes.
The certificates have been installed and the backend.yml files modified.
When I start the backends they don’t report any issues with the configuration.
However, the journal starts to fill up with the same warning
"tls: first record does not look like a TLS handshake"
I did a packet capture of the traffic between two of the nodes and this is where it gets really weird.
Looking at the traffic in Wireshark, I can see no attempt to start a TLS handshake but I can see lots of HTTP GET requests for /raft/stream/message and /raft/stream/msgapp objects.
I tried stopping the backends, clearing out the state-dir and restarting but that made no difference.
If I comment out all the tls stuff and change https to http, all the backends start up without a problem.
I have no idea what’s going on here and any help will be much appreciated.
Solved.
I originally configured the cluster to use HTTP.
When I reconfigured it to use HTTPS, I was unaware that the etcd component was holding state somewhere, so it was still trying to use HTTP to communicate with backends that were expecting HTTPS.
I rebuilt the cluster from scratch using HTTPS and it is now working as expected.
Hey there @Michael_Kelly , glad to hear you solved it. Etcd does indeed hold the configuration for TLS, so if the members start off unsecured, Etcd stores them as such.
Hi @aaronsachs, thanks for your reply.
Where does etcd store that configuration?
I’m wondering if it is possible to avoid having to rebuild an existing sensu-go cluster (currently using HTTP).
The cluster I created was a test to see if we could use certificates generated by puppet but it looks like we can’t. All I see are ‘bad certificate’ errors.
Etcd stores the data in whatever you specify as the data directory in your `backend.yml`. By default, that’s `/var/lib/sensu/sensu-backend/etcd`. The challenge is accessing the data directly: you’ll need to install `etcdctl` to access the data. The cluster members can be viewed using `etcdctl member list`, and you may be able to update the members using `etcdctl member update`, but it’s often quicker to rebuild.
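If you do want to try the member update route, here is a rough sketch of how it might look. It is untested against a live Sensu cluster, so treat it as a starting point. The helper `plan_peer_updates` is a hypothetical name, and it assumes the default comma-separated output of etcdctl v3 `member list` (ID, status, name, peer URLs, client URLs, learner flag). It only prints the update commands so you can review them before running anything. The endpoint `http://127.0.0.1:2379` in the usage comment is also an assumption about your etcd client URL.

```shell
#!/bin/sh
# Sketch only: prints the commands, does not run them.
# Assumes etcdctl v3 "simple" member list output, one member per line:
#   <id>, <status>, <name>, <peer-url>, <client-url>, <is-learner>

# Hypothetical helper: read `etcdctl member list` output on stdin and print
# one `etcdctl member update` command for each member whose peer URL is
# still plain http://. Members already on https:// are skipped.
plan_peer_updates() {
  awk -F', ' '$4 ~ /^http:\/\// {
    url = $4
    sub(/^http:/, "https:", url)   # swap only the scheme, keep host and port
    printf "etcdctl member update %s --peer-urls=%s\n", $1, url
  }'
}

# Example usage (endpoint is an assumption, adjust to your setup):
#   etcdctl --endpoints=http://127.0.0.1:2379 member list | plan_peer_updates
```

Older etcdctl builds may need `ETCDCTL_API=3` set in the environment. Changing the stored peer URLs is only part of the job: each backend still has to be restarted with matching HTTPS settings and certificates in backend.yml, which is why a rebuild is often the quicker option.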