We’re setting up a sensu PoC system, and I’ve been having trouble finding any information regarding performance/load testing. The sensu documentation for redis mentions the existence of redis-benchmark, but doesn’t give any indication of how to model sensu’s access patterns. Eventually, we want to be running sensu and its dependencies in docker containers (in a scalable HA configuration), but we want a way to measure the effect this will have on the performance of sensu, redis, rabbitmq, etc. I’ve looked through the sensu-redis library, but there doesn’t seem to be much to indicate how the connection is configured. Is pipelining enabled? Is the connection kept alive?
Monitoring goals:
- Primarily monitor kubernetes containers, pods, etc., as well as some stand-alone docker hosts and software routers.
- Work with AWS, GCE, and bare metal.
- Scale across AZs and regions.
- Regions are tolerant of failures in other regions.
- Monitor 10,000 hosts per region.
I ended up creating 3 nodes in virtualbox (sensu/uchiwa, redis, and rabbitmq w/ ssl), and used redis-cli to try to observe sensu’s interactions with redis. All 3 nodes have been configured to run ‘metrics-cpu.rb’ & ‘check-process.rb -p cron’ checks to generate some simple test data. It would appear that sensu-server and sensu-api each create a single client connection to redis. I’m guessing we can ignore the sensu-api (and, in HA, redis sentinel) connections, as I assume they have a minor impact on redis’s performance compared to sensu-server, which leaves a single client connection (equating to the ‘-c 1’ switch in redis-benchmark). Should I increase this in a redis HA config to account for the slaves reading from the master? Also, the number of connections never changes, so I assume sensu-server & sensu-api are using keepalive (‘-k 1’ in redis-benchmark). Using ‘redis-cli --bigkeys’, I get the following output:
$ redis-cli -h redis-0 -a REDIS-PASSWD --bigkeys
Scanning the entire keyspace to find biggest keys as well as
average sizes per key type. You can use -i 0.1 to sleep 0.1 sec
per 100 SCAN commands (not usually needed).
[00.00%] Biggest string found so far ‘client:rabbitmq-0’ with 208 bytes
[00.00%] Biggest string found so far ‘result:sensu-0:cpu_metrics’ with 228 bytes
[00.00%] Biggest list found so far ‘history:rabbitmq-0:keepalive’ with 21 items
[00.00%] Biggest set found so far ‘clients’ with 3 members
[31.25%] Biggest string found so far ‘result:sensu-0:cron’ with 248 bytes
-------- summary -------
Sampled 32 keys in the keyspace!
Total key length in bytes is 680 (avg len 21.25)
Biggest string found ‘result:sensu-0:cron’ has 248 bytes
Biggest list found ‘history:rabbitmq-0:keepalive’ has 21 items
Biggest set found ‘clients’ has 3 members
18 strings with 2827 bytes (56.25% of keys, avg size 157.06)
9 lists with 189 items (28.12% of keys, avg size 21.00)
5 sets with 13 members (15.62% of keys, avg size 2.60)
0 hashs with 0 fields (00.00% of keys, avg size 0.00)
0 zsets with 0 members (00.00% of keys, avg size 0.00)
I know this is a very small sample to be basing my testing config off of, but lacking any official guidance, I needed SOME starting point. Please correct me if I’m wrong, but based on the ‘avg size 157.06’ line, I’ve set the redis-benchmark value data size to 160 bytes (‘-d 160’). I’m using the default number of requests (100,000) with the keyspace randomization set to 10,000 to mimic 10,000 hosts/pods/containers/etc. sending 10 metrics each. So my final redis-benchmark command is:
redis-benchmark -h redis-0 -a REDIS-PASSWD -q --csv -n 100000 -c 1 -r 10000 -d 160
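For what it’s worth, here is a minimal sketch of how the ‘-d’ value could be derived from the --bigkeys summary instead of being read off by eye. The helper name payload_size is made up for this example, and it assumes the summary line keeps the "N strings with M bytes" format shown above; it divides total string bytes by the number of string keys and rounds up to the next multiple of 10.

```shell
#!/bin/sh
# payload_size (hypothetical helper): reads `redis-cli --bigkeys` output on
# stdin and prints the average string value size rounded up to a multiple of 10.
# Assumes a summary line of the form: "18 strings with 2827 bytes (...)".
payload_size() {
  awk '/ strings with / && $1 > 0 {
    avg = $4 / $1              # total bytes / number of string keys
    n = int(avg / 10)          # whole multiples of 10 below the average
    if (n * 10 < avg) n++      # round up unless already an exact multiple
    print n * 10
  }'
}
```

Piping the real output through it, e.g. `redis-cli -h redis-0 -a REDIS-PASSWD --bigkeys | payload_size`, should give 160 for the summary above (2827 / 18 = 157.06, rounded up).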
I then loop this redis-benchmark command to execute 5 times from a benchmark node against the redis server for each of the following configurations:
- Debian 8 VM with redis 3.2.8 installed from jessie-backports.
- Redis 3.2.8 docker containers.
- 10GB LVM volume mounted for writing the AOF and dump.rdb to.
- With 2 slaves, and redis sentinels running on the master and each slave.
- Changing ‘-c 1’ to ‘-c 3’ to simulate scaling up to 3 sensu-server nodes.
- All the various combinations of the five items above.
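The loop itself is roughly the following sketch. The function name run_benchmarks, the BENCH_CMD variable, and the output file naming are all made up for illustration; BENCH_CMD exists only so the loop can be dry-run without a redis server. The redis-benchmark flags are the same ones used in the command above, with ‘-c’ passed in so the 1-client and 3-client cases share one code path.

```shell
#!/bin/sh
# Sketch: repeat one redis-benchmark configuration several times and keep
# each run's CSV output in its own file for later comparison.
# BENCH_CMD and run_benchmarks are hypothetical names for this example.
BENCH_CMD="${BENCH_CMD:-redis-benchmark}"

run_benchmarks() {
  label="$1"      # tag for the configuration under test, e.g. "vm" or "docker"
  clients="$2"    # value for -c (1 = one sensu-server, 3 = three)
  runs="${3:-5}"  # number of repetitions, 5 by default
  i=1
  while [ "$i" -le "$runs" ]; do
    "$BENCH_CMD" -h redis-0 -a REDIS-PASSWD -q --csv \
      -n 100000 -c "$clients" -r 10000 -d 160 \
      > "${label}-c${clients}-run${i}.csv"
    i=$((i + 1))
  done
}
```

For example, `run_benchmarks docker 3` would write docker-c3-run1.csv through docker-c3-run5.csv in the current directory, one file per run.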
Thoughts? Advice?
It would REALLY help if this was documented somewhere.
Ditto for rabbitmq perf testing.
Also, official sensu HA docs would be nice.
Also, I want a pony.
Gabriel Burkholder