Cannot get rid of keepalives

#1

Hi, I have keepalives configured on all my nodes on one Sensu server.

They used to be pointing at the other Sensu server.

My problem is that even though I’ve deleted the nodes from the old Sensu server through the API, they keep getting rediscovered and reported as keepalive failures.

Has anyone seen this?

#2

Can you confirm there is no shared RabbitMQ or Redis between the new and old servers?
Also, when you did the move, is it possible there is some stale DNS?
Most processes, Sensu included, don't re-resolve and reconnect on their own; you
have to restart the Sensu clients to get them to point to a new RabbitMQ,
etc.
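
One way to check the first question is to compare the transport and data store endpoints in each server's configuration. This is only a sketch: it assumes each server's settings have been loaded from JSON with top-level `rabbitmq` and `redis` sections carrying `host` and `port` keys, that a missing host means localhost, and that the usual default ports apply. The hostnames and the `shared_backends` helper are made up for illustration.

```python
import json

# Loopback hosts mean "local to each server", so a match on them is not sharing.
LOOPBACK = {"localhost", "127.0.0.1", "::1"}


def endpoint(config, section, default_port):
    """Return (host, port) for a config section such as "redis" or "rabbitmq"."""
    settings = config.get(section, {})
    # Assumption: a missing host means localhost, a missing port means the default.
    return settings.get("host", "localhost"), settings.get("port", default_port)


def shared_backends(old_config, new_config):
    """List the sections where both servers point at the same remote endpoint."""
    shared = []
    for section, default_port in (("rabbitmq", 5672), ("redis", 6379)):
        old_ep = endpoint(old_config, section, default_port)
        new_ep = endpoint(new_config, section, default_port)
        if old_ep == new_ep and old_ep[0] not in LOOPBACK:
            shared.append(section)
    return shared


# Hypothetical settings for the old and the new server.
old = json.loads('{"rabbitmq": {"host": "mq-old.example.com"},'
                 ' "redis": {"host": "redis.example.com", "port": 6379}}')
new = json.loads('{"rabbitmq": {"host": "mq-new.example.com"},'
                 ' "redis": {"host": "redis.example.com"}}')

print(shared_backends(old, new))  # prints ['redis']
```

An empty list would suggest the two servers really are isolated; any entry is worth a closer look before blaming the clients.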


On Tue, May 17, 2016 at 10:13 AM, Stuart Cracraft <smcracraft@gmail.com> wrote:

Hi, I have keepalive's configured on all my nodes on one Sensu server.

They used to be pointing at the other Sensu server.

My problem is that even though I've API deleted the nodes from the old Sensu
server, they keep getting rediscovered and reported as keepalive failures.

Has anyone seen this?

#3

No shared RabbitMQ. One in each server.

No shared Redis either. We use RabbitMQ.

I also restarted the Sensu client after deleting the node through the API.

Bizarre.


On May 17, 2016, at 7:14 PM, Kyle Anderson <kyle@xkyle.com> wrote:

Can you confirm there is no shared rabbitmq or redis between the new and old?
Also when you did the move, is it possible there is some stale dns?
(most processes, sensu included, don't re-resolve and re-connect, you
have to restart sensu clients to get them to point to a new rabbitmq,
etc.)


#4

Are you really sure you don't use Redis? It is the only supported
data store for Sensu.
https://sensuapp.org/docs/latest/data-store

I think it is very likely these two servers are sharing a Redis instance.

Also keep in mind that the Sensu client is pretty much always
re-registering itself, so if you restart the client *after* you delete
it from the API, it may re-register during that window.
You should probably restart the client first, allow it to connect to
the new RabbitMQ, and *then* delete the client from the old server.
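
The ordering above could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes the old server's API listens at a hypothetical `http://old-sensu.example.com:4567`, that a client is removed with `DELETE /clients/<name>`, and that you can read the client's last keepalive time as a Unix timestamp from the old server. The helper names and the 60-second quiet period are made up for illustration.

```python
import urllib.request
from urllib.parse import quote


def stopped_reporting(last_keepalive, now, quiet_seconds=60):
    """True once the old server has heard nothing from the client for quiet_seconds."""
    return now - last_keepalive >= quiet_seconds


def build_delete_request(api_base, client_name):
    """Build (but do not send) DELETE <api_base>/clients/<client_name>."""
    url = f"{api_base.rstrip('/')}/clients/{quote(client_name, safe='')}"
    return urllib.request.Request(url, method="DELETE")


# Hypothetical old-server API address and client name.
request = build_delete_request("http://old-sensu.example.com:4567", "web-01")
print(request.get_method(), request.full_url)
# prints: DELETE http://old-sensu.example.com:4567/clients/web-01
```

The request would only be sent (for example with `urllib.request.urlopen(request)`) after the client has been restarted and `stopped_reporting` returns True, so that a late keepalive cannot re-register the client on the old server.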


On Tue, May 17, 2016 at 7:19 PM, Stuart Cracraft <smcracraft@me.com> wrote:

No shared RabbitMQ. One in each server.

No shared Redis either. We use RabbitMQ.

Also restarted Sensu client after API node delete.

Bizarre.


#5

Okay, thanks, I’ll look into it tomorrow.

Currently involved in, ironically, a completely separate non-Sensu Redis outage.

For now the Sensu keepalives are filtered and sequestered away from people.


On May 17, 2016, at 7:24 PM, Kyle Anderson <kyle@xkyle.com> wrote:

Are you really sure you don't use redis? It is the only supported
datastore of Sensu.
https://sensuapp.org/docs/latest/data-store

I think it is very likely these two servers are sharing a redis.

Well keep in mind that the sensu client is pretty much always
re-registering itself, so if you restart the client *after* you delete
it from the api, it may re-register during that race.
You should probably restart the client first, allow it to connect to
the new rabbitmq, and *then* delete the client from the old server.
