Sensu cluster: members not joining

3 nodes running Ubuntu 20.04. Sensu Go 6.5.4. I have opened the following ports: 2379, 2380, 3000, 8080, 8081

Process:

  • install sensu-go packages from DEB repo
  • touch /etc/sensu/backend.yml on all nodes
  • systemctl start sensu-backend on all nodes
  • sensu-backend init --ignore-already-initialized (with the env vars set) on all nodes
  • sensuctl configure -n --username ...... --url 'http://127.0.0.1:8080' on all nodes
  • fill backend.yml with content on all nodes (see below)
  • restart sensu-backend on all nodes

This is what I get on all nodes:

root@sensu-01:~# sensuctl cluster health
=== Etcd Cluster ID: 3b0efc7b379f89be
         ID            Name     Error   Healthy
─────────────────── ────────── ─────── ──────────
  8927110dc66458af   sensu-01           true
root@sensu-01:~# sensuctl cluster member-list
=== Etcd Cluster ID: 3b0efc7b379f89be
         ID            Name           Peer URLs              Client URLs
─────────────────── ────────── ─────────────────────── ─────────────────────────
  8927110dc66458af   sensu-01   http://127.0.0.1:2380   http://10.2.80.97:2379

backend.yml on sensu-01:

etcd-advertise-client-urls: "http://10.2.80.97:2379"
etcd-listen-client-urls: "http://10.2.80.97:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "sensu-01=http://10.2.80.97:2380,sensu-02=http://10.2.92.73:2380,sensu-03=http://10.2.99.134:2380"
etcd-initial-advertise-peer-urls: "http://10.2.80.97:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: "c120499b8eb96bad95463e86e57926d0"
etcd-name: "sensu-01"
log-level: "info"

on sensu-02:

etcd-advertise-client-urls: "http://10.2.92.73:2379"
etcd-listen-client-urls: "http://10.2.92.73:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "sensu-01=http://10.2.80.97:2380,sensu-02=http://10.2.92.73:2380,sensu-03=http://10.2.99.134:2380"
etcd-initial-advertise-peer-urls: "http://10.2.92.73:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: "c120499b8eb96bad95463e86e57926d0"
etcd-name: "sensu-02"
log-level: "info"

on sensu-03:

etcd-advertise-client-urls: "http://10.2.99.134:2379"
etcd-listen-client-urls: "http://10.2.99.134:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "sensu-01=http://10.2.80.97:2380,sensu-02=http://10.2.92.73:2380,sensu-03=http://10.2.99.134:2380"
etcd-initial-advertise-peer-urls: "http://10.2.99.134:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: "c120499b8eb96bad95463e86e57926d0"
etcd-name: "sensu-03"
log-level: "info"
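
The three files above differ only in the node name and the node's own IP address, so they can be generated from one template to rule out copy-and-paste drift between nodes. The following is a minimal sketch, not part of the original setup: the function name `render_backend_yml` and the two shell variables are made up for illustration, and the values are copied from the configs above.

```shell
# Values shared by every node (copied from the configs above).
INITIAL_CLUSTER="sensu-01=http://10.2.80.97:2380,sensu-02=http://10.2.92.73:2380,sensu-03=http://10.2.99.134:2380"
CLUSTER_TOKEN="c120499b8eb96bad95463e86e57926d0"

# render_backend_yml NAME IP
# Hypothetical helper: prints a backend.yml for one node on stdout.
# $1 is the etcd member name, $2 is that node's own IP address.
render_backend_yml() {
  cat <<EOF
etcd-advertise-client-urls: "http://$2:2379"
etcd-listen-client-urls: "http://$2:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "$INITIAL_CLUSTER"
etcd-initial-advertise-peer-urls: "http://$2:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: "$CLUSTER_TOKEN"
etcd-name: "$1"
log-level: "info"
EOF
}
```

On sensu-01 this could be used as `render_backend_yml sensu-01 10.2.80.97 > /etc/sensu/backend.yml`, and likewise with the other names and addresses on the other two nodes.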

If I curl sensu-02 from sensu-01 I get 404 not found:

root@sensu-01:~# curl http://10.2.92.73:2380
404 page not found

I’ve tried to manually add sensu-02 to sensu-01 but it broke the first node:

root@sensu-01:~# sensuctl cluster member-add sensu-02 http://10.2.92.73:2380
added member ff4715377217cdae to cluster

ETCD_NAME="sensu-02"
ETCD_INITIAL_CLUSTER="sensu-01=http://127.0.0.1:2380,sensu-02=http://10.2.92.73:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
root@sensu-01:~# sensuctl cluster health
Error: result header was empty, etcd cluster may be down

What am I missing?

I changed the sensuctl configuration procedure to:

sensuctl configure -n --username ...... --url "http://10.2.80.97:8080"

and reinstalled everything from scratch, but I still get:

root@sensu-01:~# sensuctl cluster member-list
=== Etcd Cluster ID: 3b0efc7b379f89be
         ID            Name           Peer URLs              Client URLs
─────────────────── ────────── ─────────────────────── ─────────────────────────
  8927110dc66458af   sensu-01   http://127.0.0.1:2380   http://10.2.80.97:2379

For some reason the Sensu backend appears to insist on listening on IPv6 addresses, even though IPv6 is not enabled on the local network. That may explain why the cluster definition is broken.

But I’m still not sure how to fix it.

# cat /etc/sensu/backend.yml 
etcd-advertise-client-urls: "http://10.2.80.97:2379"
etcd-listen-client-urls: "http://0.0.0.0:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "sensu-01=http://10.2.80.97:2380,sensu-02=http://10.2.92.73:2380,sensu-03=http://10.2.99.134:2380"
etcd-initial-advertise-peer-urls: "http://10.2.80.97:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: "c120499b8eb96bad95463e86e57926d0"
etcd-name: "sensu-01"
log-level: "info"
# netstat -tlpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 10.2.80.97:8301         0.0.0.0:*               LISTEN      447/consul          
tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      447/consul          
tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      462/named           
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      409/systemd-resolve 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      650/sshd: /usr/sbin 
tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      447/consul          
tcp        0      0 127.0.0.1:953           0.0.0.0:*               LISTEN      462/named           
tcp6       0      0 :::2379                 :::*                    LISTEN      466/sensu-backend   
tcp6       0      0 :::2380                 :::*                    LISTEN      466/sensu-backend   
tcp6       0      0 :::8080                 :::*                    LISTEN      466/sensu-backend   
tcp6       0      0 :::8081                 :::*                    LISTEN      466/sensu-backend   
tcp6       0      0 :::22                   :::*                    LISTEN      650/sshd: /usr/sbin 
tcp6       0      0 :::3000                 :::*                    LISTEN      466/sensu-backend
# sensuctl cluster member-list
=== Etcd Cluster ID: 3b0efc7b379f89be
         ID            Name           Peer URLs              Client URLs        
─────────────────── ────────── ─────────────────────── ─────────────────────────
  8927110dc66458af   sensu-01   http://127.0.0.1:2380   http://10.2.80.97:2379
# sysctl -p | grep ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

This is what I see if I start Sensu backend manually from the same config file:

# /usr/sbin/sensu-backend start -c /etc/sensu/backend.yml
{"component":"etcd","level":"info","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"],"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["http://10.2.80.97:2379"],"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/etcd.go:307","msg":"starting an etcd server","etcd-version":"3.5.0","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.17.1","go-os":"linux","go-arch":"amd64","max-cpu-set":2,"max-cpu-available":2,"member-initialized":true,"name":"sensu-01","data-dir":"/var/lib/sensu/sensu-backend/etcd/data","wal-dir":"/var/lib/sensu/sensu-backend/etcd/wal","wal-dir-dedicated":"/var/lib/sensu/sensu-backend/etcd/wal","member-dir":"/var/lib/sensu/sensu-backend/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://10.2.80.97:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://10.2.80.97:2379"],"listen-client-urls":["http://10.2.80.97:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-size-bytes":4294967296,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"revision","auto-compaction-retention":"2ns","auto-compaction-interval":"2ns","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/sensu/sensu-backend/etcd/data/member/snap/db","took":"259.405µs","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/server.go:526","msg":"No snapshot found. Recovering WAL from scratch!","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/raft.go:483","msg":"restarting local member","cluster-id":"3b0efc7b379f89be","local-member-id":"8927110dc66458af","commit-index":919,"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af switched to configuration voters=()","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af became follower at term 7","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft 8927110dc66458af [peers: [], term: 7, commit: 919, applied: 0, lastindex: 919, lastterm: 7]","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"membership/cluster.go:276","msg":"recovered/added member from store","cluster-id":"3b0efc7b379f89be","local-member-id":"8927110dc66458af","recovered-remote-peer-id":"8927110dc66458af","recovered-remote-peer-urls":["http://127.0.0.1:2380"],"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"membership/cluster.go:285","msg":"set cluster version from store","cluster-version":"3.5","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"warning","caller":"auth/store.go:1220","msg":"simple token is not cryptographically signed","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"mvcc/kvstore.go:345","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":609,"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"mvcc/kvstore.go:415","msg":"kvstore restored","current-rev":683,"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/quota.go:117","msg":"enabled backend quota","quota-name":"v3-applier","quota-size-bytes":4294967296,"quota-size":"4.3 GB","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/server.go:834","msg":"starting etcd server","local-member-id":"8927110dc66458af","local-server-version":"3.5.0","cluster-id":"3b0efc7b379f89be","cluster-version":"3.5","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/server.go:728","msg":"started as single-node; fast-forwarding election ticks","local-member-id":"8927110dc66458af","forward-ticks":9,"forward-duration":"900ms","election-ticks":10,"election-timeout":"1s","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af switched to configuration voters=(9882886658148554927)","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"membership/cluster.go:393","msg":"added member","cluster-id":"3b0efc7b379f89be","local-member-id":"8927110dc66458af","added-peer-id":"8927110dc66458af","added-peer-peer-urls":["http://127.0.0.1:2380"],"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"membership/cluster.go:523","msg":"updated cluster version","cluster-id":"3b0efc7b379f89be","local-member-id":"8927110dc66458af","from":"3.5","to":"3.5","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/etcd.go:276","msg":"now serving peer/client/metrics","local-member-id":"8927110dc66458af","initial-advertise-peer-urls":["http://10.2.80.97:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://10.2.80.97:2379"],"listen-client-urls":["http://10.2.80.97:2379"],"listen-metrics-urls":[],"time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/etcd.go:580","msg":"serving peer traffic","address":"[::]:2380","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/etcd.go:552","msg":"cmux::serve","address":"[::]:2380","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af is starting a new election at term 7","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af became pre-candidate at term 7","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af received MsgPreVoteResp from 8927110dc66458af at term 7","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af became candidate at term 8","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af received MsgVoteResp from 8927110dc66458af at term 8","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8927110dc66458af became leader at term 8","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: 8927110dc66458af elected leader 8927110dc66458af at term 8","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"etcdserver/server.go:2027","msg":"published local member to cluster through raft","local-member-id":"8927110dc66458af","local-member-attributes":"{Name:sensu-01 ClientURLs:[http://10.2.80.97:2379]}","request-path":"/0/members/8927110dc66458af/attributes","cluster-id":"3b0efc7b379f89be","publish-timeout":"7s","time":"2021-11-17T01:36:16Z"}
{"component":"sensu-etcd","level":"info","msg":"Etcd ready to serve client connections","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/serve.go:98","msg":"ready to serve client requests","time":"2021-11-17T01:36:16Z"}
{"component":"etcd","level":"info","caller":"embed/serve.go:140","msg":"serving client traffic insecurely; this is strongly discouraged!","address":"10.2.80.97:2379","time":"2021-11-17T01:36:16Z"}
{"component":"backend","entity":{"entity_class":"backend","system":{"hostname":"sensu-01","os":"linux","platform":"ubuntu","platform_family":"debian","platform_version":"20.04","network":{"interfaces":[{"name":"lo","addresses":["127.0.0.1/8"]},{"name":"eth0","mac":"02:a0:b5:5f:64:99","addresses":["10.2.80.97/21"]}]},"arch":"amd64","libc_type":"glibc","vm_system":"xen","vm_role":"guest","cloud_provider":"","processes":null},"subscriptions":null,"last_seen":0,"deregister":false,"deregistration":{},"metadata":{"name":"sensu-01"},"sensu_agent_version":""},"level":"info","msg":"backend entity information","time":"2021-11-17T01:36:16Z"}
{"component":"pipelined","level":"warning","msg":"StoreTimeout not configured","time":"2021-11-17T01:36:16Z"}
{"component":"licensing","level":"info","msg":"starting the license watcher","time":"2021-11-17T01:36:16Z"}
{"component":"licensing","level":"info","msg":"no enterprise license found","time":"2021-11-17T01:36:16Z"}
{"component":"store-providers","level":"info","msg":"starting the event store providers watcher","time":"2021-11-17T01:36:16Z"}
{"component":"store-providers","level":"info","msg":"no event store provider found","time":"2021-11-17T01:36:16Z"}
{"component":"agentd","level":"warning","msg":"starting agentd on address: [::]:8081","time":"2021-11-17T01:36:16Z"}
{"component":"apid","level":"warning","msg":"starting apid on address: [::]:8080","time":"2021-11-17T01:36:16Z"}
{"component":"tessend","level":"info","msg":"tessen is opted in, enabling tessen.. thank you so much for your support 💚","opt-out":false,"time":"2021-11-17T01:36:16Z"}
{"component":"metricsd","level":"info","msg":"starting metricsd","time":"2021-11-17T01:36:16Z"}
{"component":"metricsd","level":"info","msg":"metricsd running","time":"2021-11-17T01:36:16Z"}
{"component":"auth-providers","level":"info","msg":"starting the authentication providers watcher","time":"2021-11-17T01:36:16Z"}
{"component":"auth-providers","level":"info","msg":"no authentication provider found","time":"2021-11-17T01:36:16Z"}
{"component":"secrets-providers","level":"info","msg":"starting the secrets providers watcher","time":"2021-11-17T01:36:16Z"}
{"component":"secrets-providers","level":"info","msg":"enabled secrets provider \"env\"","time":"2021-11-17T01:36:16Z"}
{"component":"web","level":"warning","msg":"starting webd on address: [::]:3000","time":"2021-11-17T01:36:16Z"}
{"component":"backend","level":"warning","msg":"backend is running and ready to accept events","time":"2021-11-17T01:36:16Z"}
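
A few fields in the output above are easy to miss in the wall of JSON: etcd reports "member-initialized":true, an empty "initial-cluster", and a recovered peer URL of http://127.0.0.1:2380. That looks like a restart of an already bootstrapped single-node member rather than a fresh bootstrap from the etcd-initial-* settings in backend.yml. The following is a small sketch for pulling just those fields out of the log; the function name `etcd_bootstrap_hints` is made up, and it assumes a grep that supports `-o` and `-E` (GNU grep does).

```shell
# etcd_bootstrap_hints
# Hypothetical helper: reads sensu-backend JSON log lines on stdin and prints
# only the fields that show how the embedded etcd member was bootstrapped.
etcd_bootstrap_hints() {
  grep -oE \
    -e '"member-initialized":(true|false)' \
    -e '"initial-cluster":"[^"]*"' \
    -e '"recovered-remote-peer-urls":\[[^]]*]'
}
```

For example, `/usr/sbin/sensu-backend start -c /etc/sensu/backend.yml 2>&1 | etcd_bootstrap_hints` would reduce the output above to those three fields. A node that is really part of the three-node cluster would be expected to show the other members' peer URLs rather than only 127.0.0.1.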

Hi there,

I’ve looked over your configuration, and all of it seems to jibe with what I have in my own lab (a 3-node cluster with TLS all the way down). One thing I didn’t see in your initial post is whether the data directory was completely wiped. What got me thinking about this is that some of your output shows sensu-01 still reporting the loopback address (http://127.0.0.1:2380) as its peer URL. So my suggestion would be to stop Sensu completely on all hosts, make sure that /var/lib/sensu/sensu-backend/etcd is completely wiped, make sure that your sensuctl configuration in ~/.config/sensu/sensuctl is wiped as well, and see if that makes a difference.

One other thing to note with regard to your post here:

This may explain what you’re seeing: Netstat shows tcp6 on ipv4 only host. Effectively, Sensu is listening on any IPv6 address and on any IPv4 address that is mapped to IPv6. FWIW, my lab shows something similar, and I’m running a very similar configuration.

Hi Aaron,

I’ve thought about the stale files, so I actually have an Ansible playbook that literally removes the DEB packages and then rm -rf all directories that were listed in the DEB manifest. I use that to wipe the slate clean between attempts (and then the install procedure is also in Ansible, so I can iterate very quickly). So I’m reasonably sure that’s not the cause.

Re: the IPv6 issue, it may be a red herring after all. I can ssh to, say, sensu-02 and telnet sensu-01 on the Sensu ports, using IPv4, and I can open a TCP socket that way just fine - which means something is listening there. I do get HTTP error codes if I type GET / HTTP/1.0, but that’s probably normal.

Perhaps netstat is broken or something. I’m not sure.

Any other ideas for something else to try would be very welcome.

Solved it.

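The order of operations is important. If you start and initialize the backends before backend.yml contains the cluster settings, each node comes up as its own single-node cluster (with peer URL http://127.0.0.1:2380) and never joins the others, which is what I was seeing. This was not made very clear in the documentation (at least when I read it).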

The correct order is:

  1. install Sensu DEB packages
  2. create the config file /etc/sensu/backend.yml with all the cluster setup information (peers, etc)
  3. start sensu-backend (the nodes will join the cluster)
  4. initialize the backends with sensu-backend init
  5. configure sensuctl

That’s it. Now it should be working.
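
The order above can be scripted roughly as follows. This is only a sketch under a few assumptions: the function names, the `run` wrapper and the `DRY_RUN` switch are made up for illustration; the package names sensu-go-backend and sensu-go-cli are assumed to be what the DEB repo provides; and the admin credentials are assumed to be exported as SENSU_BACKEND_CLUSTER_ADMIN_USERNAME and SENSU_BACKEND_CLUSTER_ADMIN_PASSWORD for `sensu-backend init`. With `DRY_RUN` left at its default of 1 the script only prints what it would do.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Note that the printed sensuctl command includes the password.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Steps 1-3. Run on ALL nodes first.
# $1: path to a prepared backend.yml that already contains the etcd-* cluster settings.
prepare_and_start() {
  run apt-get install -y sensu-go-backend sensu-go-cli
  run install -m 0644 "$1" /etc/sensu/backend.yml
  run systemctl start sensu-backend
}

# Steps 4-5. Run only after every node has been started.
# $1: backend API URL for sensuctl, e.g. http://10.2.80.97:8080
init_and_configure() {
  run sensu-backend init --ignore-already-initialized
  run sensuctl configure -n \
    --username "${SENSU_BACKEND_CLUSTER_ADMIN_USERNAME:-}" \
    --password "${SENSU_BACKEND_CLUSTER_ADMIN_PASSWORD:-}" \
    --url "$1"
}
```

The split into two functions is deliberate: `prepare_and_start` has to finish on every node before `init_and_configure` runs anywhere, because the point of the fix is that the backends are started with the full cluster configuration before they are initialized.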

netstat continues to show strange output, but I guess that’s a netstat bug.
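
As noted earlier in the thread, a tcp6 listener on the wildcard address (:::PORT) can also accept IPv4 connections through IPv4-mapped addresses, so this output is probably expected behaviour rather than a netstat bug. It would also be consistent with the telnet test over IPv4 succeeding. The following is a small sketch that lists such listeners from netstat output; the function name `wildcard_tcp6_ports` is made up, and the column layout is assumed to match the `netstat -tlpn` output shown above.

```shell
# wildcard_tcp6_ports
# Hypothetical helper: reads `netstat -tln` (or -tlpn) output on stdin and
# prints the port of every tcp6 listener bound to the wildcard address.
wildcard_tcp6_ports() {
  awk '$1 == "tcp6" && $4 ~ /^:::[0-9]+$/ { sub(/^:::/, "", $4); print $4 }'
}

# On Linux, the default for whether such sockets are IPv6-only can be read with:
#   cat /proc/sys/net/ipv6/bindv6only
# A value of 0 means wildcard IPv6 sockets also take IPv4 traffic by default;
# an application can still override this per socket.
```

Running `netstat -tln | wildcard_tcp6_ports` on sensu-01 would be expected to list the Sensu ports from the output above (2379, 2380, 8080, 8081, 3000) along with 22.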