Error starting etcd: multiple discovery or bootstrap flags set

Hi, I cloned betorvs/sensu-go-statefulset from GitHub (a Helm chart that deploys the Sensu Go backend as a StatefulSet in Kubernetes), used it as-is, and successfully deployed Sensu Go v5.21.3 on an Ubuntu 20.04.1 system under MicroK8s.

When I modify the values.yaml file to use Sensu Go v6.0.0 or v6.2.0 I receive the following error:

== waiting for sensu-backend-sensu-go-statefulset-0:2379 to become available before running backend-init...
{"component":"sensu-enterprise","error":"error starting etcd: multiple discovery or bootstrap flags are set. Choose one of \"initial-cluster\", \"discovery\" or \"discovery-srv\"","level":"fatal","msg":"error executing sensu-backend","time":"2021-01-04T22:56:34Z"}

The problem appears to be raised with the following command:

sensu-backend start --etcd-name ${HOST_NAME}.sensu.sensu.svc.cluster.local \
--etcd-discovery-srv sensu.sensu.svc.cluster.local \
--etcd-initial-advertise-peer-urls http://${HOST_NAME}.sensu.sensu.svc.cluster.local:2380 \
--etcd-initial-cluster-token sensu --etcd-initial-cluster-state new \
--etcd-advertise-client-urls http://${HOST_NAME}.sensu.sensu.svc.cluster.local:2379 \
--etcd-listen-client-urls http://0.0.0.0:2379 --etcd-listen-peer-urls http://0.0.0.0:2380 \
--state-dir /var/lib/sensu/sensu-backend/${HOST_NAME} --log-level debug --debug \
--api-url http://sensu-api.example.local:8080

Reading through the migration notes from 5.21.3 to 6.0.0, I did not find anything that would change the above command.

I did find a similar problem reported for etcd: "Multiple discovery flags error with valid config file" (etcd-io/etcd issue #7516).

Per the “hack fix” there, I tried inserting a blank --etcd-initial-cluster, and also tried setting it through the environment variable, but no luck.

Is there something I am missing in the above command for the migration?

I have performed additional testing.

Using a fresh install of Ubuntu 20.04.1 LTS and Docker 20.10.2, I ran the following commands (based on the "Install Sensu" page in the Sensu Docs):

$ docker run -v /var/lib/sensu:/var/lib/sensu \
-d --name sensu-backend \
-p 3000:3000 -p 8080:8080 -p 8081:8081 \
-e SENSU_BACKEND_CLUSTER_ADMIN_USERNAME=admin \
-e SENSU_BACKEND_CLUSTER_ADMIN_PASSWORD=admin \
sensu/sensu:6.2.0 \
sensu-backend start --state-dir /var/lib/sensu/sensu-backend --log-level debug \
--etcd-discovery-srv test

Tailing the container log I find the following error:

$ docker container logs --tail 100 sensu-backend
== waiting for a2992910a825:2379 to become available before running backend-init...
{"component":"sensu-enterprise","error":"error starting etcd: multiple discovery or bootstrap flags are set. Choose one of \"initial-cluster\", \"discovery\" or \"discovery-srv\"","level":"fatal","msg":"error executing sensu-backend","time":"2021-01-13T21:49:18Z"}

I should not see this error, as I am only specifying the single flag etcd-discovery-srv (i.e. neither etcd-initial-cluster nor etcd-discovery is specified).

Next, I tried different container versions in the run command above and found:

  • sensu/sensu:5.21.3 does not get the error
  • sensu/sensu:6.0.0 does get the error

Last, I tried installing v6.2.0 directly on Ubuntu 20.04.1 LTS as follows:

$ curl -s https://packagecloud.io/install/repositories/sensu/stable/script.deb.sh | sudo bash
$ sudo apt-get install sensu-go-backend

I then ran the following tests:

$ sensu-backend start --etcd-initial-cluster default=http://127.0.0.1:2380

No errors from this command as I am only specifying the single bootstrap flag (etcd-initial-cluster)

$ sensu-backend start --etcd-discovery-srv test.org

No errors from this command as I am only specifying the single discovery flag (etcd-discovery-srv)

$ sensu-backend start --etcd-initial-cluster default=http://127.0.0.1:2380 --etcd-discovery-srv test
{"component":"sensu-enterprise","error":"error starting etcd: multiple discovery or bootstrap flags are set. Choose one of \"initial-cluster\", \"discovery\" or \"discovery-srv\"","level":"fatal","msg":"error executing sensu-backend","time":"2021-01-14T13:44:16-07:00"}

The above error is expected as I am specifying multiple conflicting flags (etcd-initial-cluster and etcd-discovery-srv)

Given the above testing I believe I have shown there was a bug introduced into the Sensu Go v6+ etcd cluster member discovery logic.

Hey,

Just for clarity, because I’m not fully following your testing:
Is this a problem that impacts 5.x → 6.x upgrades only?
Or is this a problem that impacts 6.x itself?

Hey just more clarity…
reading over your testing again…
What you are saying is that the Docker image isn’t working as you expect, producing an error for a docker run invocation that should work, but the “direct on Ubuntu” invocations work as you expect?

Just trying to get clarity to narrow this down.

Hey!

As a remediation for your docker invocation, can you try setting the environment variable SENSU_BACKEND_ETCD_INITIAL_CLUSTER to an empty string, as a way to override the value imposed by the Docker image?

Hi jspaleta,

Yes, that is what I was trying to show. v5.21.3 does not have the problem at all. v6+ only has the problem when using the Docker image.

I tried your suggestion, but I still see the error:

$ docker run -v /var/lib/sensu:/var/lib/sensu \
-d --name sensu-backend \
-p 3000:3000 -p 8080:8080 -p 8081:8081 \
-e SENSU_BACKEND_CLUSTER_ADMIN_USERNAME="admin" \
-e SENSU_BACKEND_CLUSTER_ADMIN_PASSWORD="admin" \
-e SENSU_BACKEND_ETCD_INITIAL_CLUSTER="" \
sensu/sensu:6.2.3 \
sensu-backend start --state-dir /var/lib/sensu/sensu-backend --log-level debug \
--etcd-discovery-srv test

$ docker container logs --tail 100 sensu-backend
== waiting for 3c7c760f780f:2379 to become available before running backend-init...
{"component":"sensu-enterprise","error":"error starting etcd: multiple discovery or bootstrap flags are set. Choose one of \"initial-cluster\", \"discovery\" or \"discovery-srv\"","level":"fatal","msg":"error executing sensu-backend","time":"2021-01-28T20:28:26Z"}

Please let me know if you have other suggestions.

I think this is a deficiency in the script logic in the official Docker images that prevents the etcd service discovery options from working. It appears that the existing logic always sets the SENSU_BACKEND_ETCD_INITIAL_CLUSTER environment variable.
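A minimal sketch of why the empty-string override above may not have helped, assuming the entrypoint uses the `:=` assignment form that appears in the entrypoint.sh diff posted later in this thread (the hostname value here is a placeholder, and other image versions may differ). In POSIX shell, `${VAR:=default}` assigns the default when the variable is unset or empty, so a value of "" passed with `-e` is replaced just like a missing one:

```shell
#!/bin/sh
# Sketch only: reproduces the shell behavior, not the actual entrypoint.sh.
# SENSU_HOSTNAME is set to a placeholder value for illustration.
SENSU_HOSTNAME=example-host

# Simulate `docker run -e SENSU_BACKEND_ETCD_INITIAL_CLUSTER=""`.
SENSU_BACKEND_ETCD_INITIAL_CLUSTER=""

# `:=` (with the colon) treats an empty value the same as an unset one,
# so the default is assigned despite the explicit empty string.
: "${SENSU_BACKEND_ETCD_INITIAL_CLUSTER:=default=http://${SENSU_HOSTNAME}:2380}"

echo "$SENSU_BACKEND_ETCD_INITIAL_CLUSTER"
# prints: default=http://example-host:2380
```

With the default back in place, the backend would see both an initial cluster value and the --etcd-discovery-srv flag, which matches the "multiple discovery or bootstrap flags" error.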

I think the mitigation for now is going to be a custom Docker image with updated script logic to fix the problem. I’d offer you an alternative script to try, but I’m not confident I’d get the changes correct. You probably know more than I do about what you need to get the etcd discovery to work. So it might be best if you construct your own custom Docker image and report back with the needed changes for us to integrate and test.

Here’s the procedure I’ve been following to extract the script from the official image and then generate a local custom image when I need to test a behavior change.

  1. pull the sensu/sensu:latest docker image
    docker pull sensu/sensu:latest
  2. create a container from the image
    docker create --name sensu_extract sensu/sensu:latest
  3. extract the file /opt/sensu/bin/entrypoint.sh (from the container)
    docker cp sensu_extract:/opt/sensu/bin/entrypoint.sh entrypoint.sh
  4. remove the container
    docker rm sensu_extract
  5. edit the extracted file
  6. construct a new Dockerfile that uses sensu/sensu:latest with an additional COPY operation to place the edited file back into the container image
    FROM sensu/sensu:latest

    COPY entrypoint.sh /opt/sensu/bin/entrypoint.sh
  7. edit the extracted entrypoint.sh file to update the logic.
  8. build a custom image containing the edited script using the local Dockerfile.
    docker build -t sensu-local -f Dockerfile ./
  9. repeat steps 7 and 8 until the new logic works.

We’re working on making the official Dockerfiles public in a repository to make it easier for people to collaborate around exactly these sorts of things and propose changes. If we don’t have that public repository in place by the time you sort this out, I can take proposed script edits from you and shepherd them into the existing private repo pull request process on your behalf.

I hope this helps.

Thanks for the response, jspaleta. I had also opened a GitHub issue on this: “Multiple discovery or bootstrap flags error with valid configuration” (sensu/sensu-go issue #4178). Calebhailey responded that they are aware of the issue and offered a possible interim workaround, which I will try when my workload frees up.

Hi @jspaleta,

Your suspicions were correct.

Following your detailed instructions I was able to modify the v6.2.7 entrypoint.sh and get things working. Here is a diff of the change I made:

$ diff entrypoint.sh.orig entrypoint.sh
19c19,21
<     : ${SENSU_BACKEND_ETCD_INITIAL_CLUSTER:=default=http://${SENSU_HOSTNAME}:2380}
---
>     if ! echo " $*" | grep -q -e " \-\-etcd-discovery " -e " \-\-etcd-discovery-srv " -e " \-\-etcd-initial-cluster "; then
>         : ${SENSU_BACKEND_ETCD_DISCOVERY:-${SENSU_BACKEND_ETCD_DISCOVERY_SRV:-${SENSU_BACKEND_ETCD_INITIAL_CLUSTER:=default=http://${SENSU_HOSTNAME}:2380}}}
>     fi

I did some testing in docker and kubernetes and it’s working.
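For anyone adapting the change, here is a rough standalone sketch of the same idea, written with a `case` loop instead of `grep`. It is not the script from the image: the function names are hypothetical, only the SENSU_* variable names and the default value come from the diff above, and matching the `--flag=value` spelling is an extra assumption about how the flags might be passed.

```shell
#!/bin/sh
# Sketch only: mirrors the intent of the patched entrypoint.sh lines above.
# Function names are hypothetical; variable names come from the diff.

# Return 0 if any etcd bootstrap/discovery flag appears in the arguments.
has_bootstrap_flag() {
    for arg in "$@"; do
        case "$arg" in
            --etcd-discovery|--etcd-discovery=*) return 0 ;;
            --etcd-discovery-srv|--etcd-discovery-srv=*) return 0 ;;
            --etcd-initial-cluster|--etcd-initial-cluster=*) return 0 ;;
        esac
    done
    return 1
}

# Assign the single-node default only when neither the command line nor
# the discovery environment variables already select a bootstrap method.
set_initial_cluster_default() {
    if ! has_bootstrap_flag "$@"; then
        : "${SENSU_BACKEND_ETCD_DISCOVERY:-${SENSU_BACKEND_ETCD_DISCOVERY_SRV:-${SENSU_BACKEND_ETCD_INITIAL_CLUSTER:=default=http://${SENSU_HOSTNAME}:2380}}}"
    fi
}

# Example: a discovery flag on the command line leaves the default unset.
unset SENSU_BACKEND_ETCD_INITIAL_CLUSTER SENSU_BACKEND_ETCD_DISCOVERY SENSU_BACKEND_ETCD_DISCOVERY_SRV
SENSU_HOSTNAME=node0
set_initial_cluster_default start --etcd-discovery-srv test
echo "initial cluster: '${SENSU_BACKEND_ETCD_INITIAL_CLUSTER:-}'"
```

The nested expansion is evaluated lazily, so the `:=` assignment only runs when both discovery variables are unset or empty.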

As offered, can you please take my proposed script edits and shepherd them into the repo appropriately?

Thanks very much for the help!