Isolating checks

We have a bunch of monitoring checks that check things external to the machine: query this metrics database to see if this value is under a threshold, curl this server to see if it’s accessible, etc. Currently, we’re handling these as round-robin checks on a set of machines dedicated to running all of these.

One of the problems we have is that the checks often create dependency hell between them. One check will require a certain version of a gem, another will require a different version, and we have to spend a lot of time trying to upgrade things so that they’re all happy. These checks don’t actually need to run at the same time, or communicate with each other, though - they’re independent scripts that just happen to run on the same machine. This got me thinking that there might be a way to isolate them from each other.

Docker is the obvious solution these days, but a) I don’t think there’s a way for Sensu to orchestrate booting up containers for checks and b) that seems very heavy-weight for checks that run in under a second (I imagine most of the time would be spent provisioning the containers). I suppose that I could try using rvm/bundler/rbenv/etc., but it seems like that might be a lot of work to get going with Sensu.

Does anyone have any experience with this sort of thing, or have alternate approaches to solving this problem?

Hi James,

How are you installing the plugins? In Sensu, you can use sensu-install -p $PLUGINNAME, and that will generally install everything you need using Sensu’s embedded Ruby. Typically, the only extra dependency you might run into with this method is the need to install developer tools. Can you elaborate more on your installation process and what sort of dependencies you’re running into?

We have an internal gem that includes all our custom checks, and we install that gem using sensu-install. The conflicts come between the different checks we’ve written. For example, I might write a check that uses https://github.com/aws/aws-sdk-ruby in order to access the CloudTrail API. Then, a year or two later, I want to write a check that needs to access DynamoDB, but the methods I want to use aren’t available in the old version of the aws gem that we’re already using. To write this new check I need to update the version of aws-sdk, but that breaks compatibility with the CloudTrail check, and so I have to go update it too. Often this gets even worse because updating the top-level gem (in this example, aws-sdk) requires updating some transitive dependency that’s also used by another gem in another check, and so the updates cascade and multiply and I find myself needing to update a dozen checks just to add one new one.

But none of these checks actually interact with each other, and so it should be perfectly fine for the CloudTrail check to use one version of aws-sdk and the DynamoDB check to use another; the problem comes in actually creating this isolation.

We have a few dozen custom checks right now, and I see that number expanding dramatically over the next few years, which is why I’m worried about this problem.

Offhand, I’m not aware of a solution for this sort of issue. Let me check with folks internally to see if anyone has any other ideas.

I think this is the nature of software dependencies and is not specific to Ruby, gems, or Sensu. You could try running multiple Ruby runtimes, be it with containerization (Docker, LXC, etc.) or without, as long as you handle your pathing properly.

Specifically in the case of aws, you can have aws-sdk-[1-3] all loaded within the same Ruby runtime and let each script/program tell you which version it needs to load. Not all gems are designed in a way that makes your life this easy, and the approach is not without its drawbacks. Between pinning the versions that are installed and pinning the version of the library you are consuming in your script/program, I think you should be in pretty decent shape.
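As a rough sketch of what pinning the library version inside each script could look like, the snippet below uses RubyGems' `gem` method to activate a specific version before it is required. The check names and version constraints are hypothetical, and it assumes the matching aws-sdk versions are already installed side by side. Each check runs as its own process, so each one activates only its own pin.

```ruby
require 'rubygems'

# Hypothetical per-check pins. The gem name is real; the check names and
# version constraints are only illustrative.
CHECK_PINS = {
  'check-cloudtrail' => ['aws-sdk', '~> 2.0'],
  'check-dynamodb'   => ['aws-sdk', '~> 3.0']
}.freeze

# Activate the pinned gem version for one check. Call this before the
# check's own `require` line. Raises Gem::LoadError when no installed
# version satisfies the constraint.
def activate_pin(check_name, pins = CHECK_PINS)
  name, requirement = pins.fetch(check_name)
  gem name, requirement
  name
end

# Report whether a concrete version would satisfy a check's pin, without
# needing the gem to be installed.
def pin_satisfied_by?(check_name, version, pins = CHECK_PINS)
  _name, requirement = pins.fetch(check_name)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end
```

In a check script this would amount to calling `activate_pin('check-dynamodb')` (or simply `gem 'aws-sdk', '~> 3.0'`) at the top, followed by the usual `require`. This only helps when the conflicting versions can be installed next to each other; it does not resolve conflicts in shared transitive dependencies.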

I have been consuming and maintaining the community plugins + custom ones for years with only a handful of times where I have run into issues similar to these. Most of the time it was not too much effort to work through updating them. The community plugins follow semver very strictly and we do not remove versions from rubygems (except in extreme security situations) so you can pin on old software for a very long time.

majormoses, November 6:

I think this is the nature of software dependencies and is not specific to Ruby, gems, or Sensu. You could try running multiple Ruby runtimes, be it with containerization (Docker, LXC, etc.) or without, as long as you handle your pathing properly.

Right, it’s a very classic problem. It’s a bit of a unique situation with Sensu, however, in that most of the solutions aren’t designed for extremely short runtimes (I mentioned in my original post how containers, for instance, would be a poor solution in that you’d spend, say, three minutes provisioning and decommissioning the container and only a fraction of a second running it). I will probably end up trying to hack together something with rbenv, but was hoping to find some prior art that addresses the particular needs of Sensu checks.

I have been consuming and maintaining the community plugins + custom ones for years with only a handful of times where I have run into issues similar to these. Most of the time it was not too much effort to work through updating them.

I am happy for you that you do not have to deal with this problem as much as I do, but that is not actually very helpful to me since thanks to our particular dependencies I do run into this quite frequently and it’s usually a significant amount of work to update.

Hmm, taking ~3 minutes to spin up a container seems a bit long. Are you using a generic Ruby container and then installing the gems at runtime? If so, I think you could mitigate this by building specific containers that do all the installs in the build stage rather than at runtime. You would still pay a download cost whenever you update the image. Otherwise I think you are going to have to rely on something like rbenv, rvm, pyenv, etc.

I am happy for you that you do not have to deal with this problem as much as I do, but that is not actually very helpful to me since thanks to our particular dependencies I do run into this quite frequently and it’s usually a significant amount of work to update.

I was not trying to be cheeky, simply sharing my experience as someone who maintains a very large number of sensu plugins. Maybe there are some decisions that are being made in your custom plugins that cause this to manifest more often. Could you maybe share some examples so we can see if maybe we can make some changes and reduce the likelihood of conflicts? In the few cases where I have seen issues they usually stem from lack of version pinning and general maintenance/upkeep. If you do not feel comfortable sharing them publicly you can send them to me@benabrams.it and I will try to review when I have some time. If there is anything we can do better on the community plugins that you consume and cause problems I am all ears.

One more thing I will mention is that you could look at some of these projects:

Hey everyone,
Sorry for the very late reply, I wanted to include a new pattern I’ve been using to solve this problem in case a new person finds this in a Google search.

So real quick… for Sensu Core, you can isolate Sensu plugin gems as if they were an “app” and do a bundle install. We currently use this particular trick to build Ruby-based Sensu asset binaries for the existing Sensu community sensu-plugins.
For example:

This example Dockerfile creates a Gemfile for the check you want to install, then runs Bundler in standalone mode with the path and binstubs directory you want.

https://jbodah.github.io/blog/2017/11/22/using-bundler-standalone/
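The two steps described above can be sketched in plain Ruby as follows. This is not the original Dockerfile, only a minimal illustration of the same idea. The helper names, the `lib`/`bin` defaults and the example directory are assumptions. The `--standalone`, `--path` and `--binstubs` flags are `bundle install` options in Bundler 1.x and 2.x, but as far as I know newer Bundler releases deprecate `--path` and `--binstubs` in favour of `bundle config`, so check your Bundler version first.

```ruby
require 'fileutils'

# Build the Gemfile text for a single plugin gem (step 1).
def gemfile_for(gem_name, version = nil)
  line = version ? "gem '#{gem_name}', '#{version}'" : "gem '#{gem_name}'"
  "source 'https://rubygems.org'\n\n#{line}\n"
end

# Build the Bundler command that installs into an isolated location and
# generates binstubs (step 2). Returned as an array so no shell is involved.
def standalone_install_command(path: 'lib', binstubs: 'bin')
  ['bundle', 'install', '--standalone', "--path=#{path}", "--binstubs=#{binstubs}"]
end

# Write the Gemfile into its own directory and run the install there.
# Returns true when bundler exits successfully.
def install_isolated(dir, gem_name, version = nil)
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, 'Gemfile'), gemfile_for(gem_name, version))
  system(*standalone_install_command, chdir: dir)
end

# Example call (hypothetical directory; needs Bundler and network access):
# install_isolated('/opt/sensu-checks/aws', 'sensu-plugins-aws')
```

Running one such install per check, or per group of compatible checks, gives each its own gem tree and its own `bin` directory of binstubs to point the check command at.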

You don’t have to do that inside a Docker container, though; we do it that way as part of CI automation to build Sensu plugin assets consistently.

You can do the Gemfile and bundle install steps in your own config management or live environment to get the isolated gem installs you need. I think this should let you isolate plugins. The key idea here is that Bundler is instructed to install all needed deps into a special isolated location, not into the system Ruby, and to generate binstubs that do the path munging necessary to find those installed deps.
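To make the "path munging" idea concrete, here is a small illustration of the principle: put the isolated gems' lib directories at the front of Ruby's load path before anything is required. This is not the code Bundler actually generates; the helper names and the assumed `<root>/.../gems/<name>-<version>/lib` layout are only for illustration.

```ruby
# Collect every gem lib directory found under an isolated install root.
# Assumes a layout like <root>/ruby/<version>/gems/<name>-<version>/lib.
def isolated_lib_dirs(root)
  Dir.glob(File.join(root, '**', 'gems', '*', 'lib')).sort
end

# Prepend those directories to the load path so that `require` resolves to
# the isolated copies before any system-wide gems. Already-present entries
# are skipped, so calling this twice does not add duplicates.
def prepend_isolated_paths(root, load_path = $LOAD_PATH)
  isolated_lib_dirs(root).reverse_each do |dir|
    load_path.unshift(dir) unless load_path.include?(dir)
  end
  load_path
end
```

A wrapper script would call something like `prepend_isolated_paths('/opt/sensu-checks/aws/lib')` (a hypothetical path) and then load the real check. With a standalone Bundler install you would normally rely on the generated setup file and binstubs instead of writing this yourself.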

So I know this is originally a Sensu Core issue, but just wanted to also throw out how I’d solve this in Sensu Go.

Because Sensu Go lets you stack assets, it should be very easy to build your own ruby runtime asset for internal use, and build ruby plugin based assets that work against them.
You’ll have to keep up with your own internal dependency tracking to use the matching runtime, but essentially, if you follow the containerized build workflow pattern, you’ll end up with Sensu assets of the plugin gems you want that are isolated from each other entirely, so you can update each as needed.

This works really well for containerized Sensu Go deployments, so you don’t have to rebuild the image to update the ruby runtime, you update the assets instead and let the agent or backend pull the asset as needed.

-jef
