Sensu Classic vs. Sensu Go scalability

We have been running Sensu Classic with the snssqs-ng transport in AWS EKS/Kubernetes. This setup has been very scalable: it serves about 15k clients (each with 10-15 checks) distributed across about 10 datacenters, where each datacenter runs one or more API and server container pods. We use Uchiwa to display the events of all datacenters, and the whole setup consumes less than one CPU core in Kubernetes. Our Sensu clients are not real clients but Lambda functions that act like Sensu clients; we use this setup to monitor AWS services such as RDS.
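For context, a proxy-client setup like this typically reports check results for a `source` that is not a real agent. A minimal sketch of what such a Lambda might do against the Sensu Classic (1.x) Results API is below; the endpoint URL and check names are assumptions, not our actual configuration:

```python
import json
import urllib.request

# Hypothetical Sensu Classic API endpoint (the 1.x API listens on port 4567).
SENSU_API = "http://sensu-api.example.internal:4567"


def build_result(source, check, output, status=0):
    """Build a Sensu Classic check-result payload.

    Setting "source" makes Sensu attribute the result to a proxy
    client (e.g. an RDS instance) instead of the submitting host.
    """
    return {"source": source, "name": check, "output": output, "status": status}


def post_result(result, api=SENSU_API):
    """POST the result to the Sensu Classic Results API."""
    req = urllib.request.Request(
        api + "/results",
        data=json.dumps(result).encode(),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```

In practice the Lambda would call `build_result` once per monitored AWS resource and submit each payload, so one function invocation can stand in for many "clients".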

Due to the deprecation of Sensu v1, we took a look at Sensu Go OSS.

In our test setup we use an external etcd cluster (installed via the Bitnami Helm chart).

We run our Lambda functions against the Sensu Go API and create entities and events, similar to our old solution.
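The Sensu Go equivalent of the classic proxy-client pattern is to POST events (with an embedded proxy entity) to the core/v2 Events API. A rough sketch is below; the backend URL, namespace, and check names are illustrative assumptions, and a real caller would first obtain an access token or API key:

```python
import json
import urllib.request

# Hypothetical Sensu Go backend endpoint (the backend API listens on 8080).
SENSU_API = "http://sensu-backend.example.internal:8080"


def build_event(entity_name, check_name, output, status=0):
    """Build a Sensu Go event for a proxy entity.

    The entity and check are created implicitly when the event is
    posted, mirroring how classic proxy clients appeared on demand.
    """
    return {
        "entity": {
            "entity_class": "proxy",
            "metadata": {"name": entity_name, "namespace": "default"},
        },
        "check": {
            "metadata": {"name": check_name, "namespace": "default"},
            "output": output,
            "status": status,
            "interval": 60,
        },
    }


def post_event(event, token, api=SENSU_API):
    """POST the event to the core/v2 Events API with a bearer token."""
    req = urllib.request.Request(
        api + "/api/core/v2/namespaces/default/events",
        data=json.dumps(event).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )
    return urllib.request.urlopen(req)
```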

The new setup cannot handle more than about 3k clients without running into problems like these:

I understand that Sensu Go has a commercial offering that might scale better, but these look like general issues. It would be great if Sensu Go OSS scaled similarly to Sensu Classic.
