Recommendations on how to migrate metrics checks

Intro
There was a great conversation on Slack today about some confusion over migrating metrics checks from Sensu Core to Sensu Go. I want to summarize the takeaways from that conversation here, so they’re easily discoverable by people who run into the same problem. Thanks to Casey Scott for reaching out on Slack, and to Nikki Attea for helping to clear up the confusion.

So what’s the underlying confusion?

Sensu Go events introduce a new top-level metrics attribute. This means you no longer have to mutate the Sensu event, provided you are using a handler that knows how to natively ingest that metrics attribute. The community-maintained plugins haven’t fully caught up with the event model changes, so people migrating may need to switch out the handlers they were using in order to take full advantage of the metrics enhancements in Sensu Go.

Metric check migration example
Let’s assume we are migrating a check that uses the metrics-curl.rb command provided by the sensu-plugins-http plugin from the community-maintained Sensu Plugins collection. This command writes metrics in graphite plaintext format to stdout. Additionally, I’ll assume we want to send these metrics into an InfluxDB time-series database.

Previously, in Sensu Core, you would have been required to mutate the event on the central Sensu Core server before sending the output to the influxdb handler. The mutate step was needed to coerce the metrics data in the check output into a format that InfluxDB could ingest.

In Sensu Go, the Sensu agent can be instructed to extract the metric information from the command output and populate the Sensu Go event’s metrics attribute before sending the event to the Sensu backend server. This is done by setting the check’s output_metric_format attribute to match the expected metric format. The agent knows how to extract several common formats, including graphite plaintext and nagios perfdata, among others.
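To make the extraction step concrete, here is a rough Python sketch of the kind of transformation involved. It is not the agent’s actual code (the agent is written in Go); the function name, the sample output, and the "influxdb" handler name are made up for illustration, and the shape of the resulting structure (a list of points with name, value, timestamp and tags) is my understanding of the event’s metrics attribute.

```python
import json


def parse_graphite_plaintext(output: str) -> list:
    """Turn 'path value timestamp' lines into metric point dicts."""
    points = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, value, timestamp = fields
        try:
            points.append({
                "name": name,
                "value": float(value),
                "timestamp": int(float(timestamp)),
                "tags": [],  # graphite plaintext carries no tags
            })
        except ValueError:
            continue  # skip lines whose value or timestamp is not numeric
    return points


# Made-up sample in the graphite plaintext shape a metrics command prints.
sample_output = """\
host01.curl_timings.time_total 0.005 1552506033
host01.curl_timings.http_code 200 1552506033
"""

# Assumed shape of the event's metrics attribute; "influxdb" is a
# hypothetical handler name.
event_metrics = {
    "handlers": ["influxdb"],
    "points": parse_graphite_plaintext(sample_output),
}
print(json.dumps(event_metrics, indent=2))
```

The point is that this parsing happens once, on the agent, so nothing downstream has to understand graphite plaintext.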

The Sensu Go check definition also provides an output_metric_handlers attribute where a list of metrics handlers can be defined, separate from the handlers list that is meant to operate on return status. Keeping the two lists separate makes it possible to have different workflows for metrics and status, without having to use complicated conditional filter logic. For this migration example, it would be appropriate to set output_metric_handlers to include the influxdb handler and reserve the handlers attribute for a handler to use if the check returns an error status, which would indicate a more urgent concern in need of remediation.
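As an illustration of how the two handler lists sit side by side, the following sketch builds a check definition as a Python dict and prints it as JSON. The check name, subscription, interval, URL, and the "influxdb" and "slack" handler names are all placeholders I picked for the example. As far as I know, the attributes are spelled output_metric_format and output_metric_handlers in the Sensu Go check spec, and graphite_plaintext is the value for graphite plaintext output; double-check against the docs for your Sensu Go version.

```python
import json

# Hypothetical check definition; every name and value below is a placeholder.
check = {
    "type": "CheckConfig",
    "api_version": "core/v2",
    "metadata": {"name": "metrics-curl", "namespace": "default"},
    "spec": {
        # Assumes metrics-curl.rb is available on the agent's PATH.
        "command": "metrics-curl.rb -u http://localhost",
        "subscriptions": ["web"],
        "interval": 60,
        "publish": True,
        # Tell the agent which format to extract from stdout.
        "output_metric_format": "graphite_plaintext",
        # Metrics workflow: only these handlers receive the metrics.
        "output_metric_handlers": ["influxdb"],
        # Status workflow: alerting on the check's return status.
        "handlers": ["slack"],
    },
}

print(json.dumps(check, indent=2))
```

Saving the printed JSON to a file and loading it with sensuctl create --file should register the check, assuming handlers with those names already exist.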

For Sensu Go, it’s preferable to use an influxdb handler that understands how to parse the Sensu event’s metrics attribute. At present the best choice for this is the sensu-influxdb-handler, written in golang and designed to work with the new metrics attribute available in the Sensu Go event model. No mutator is required: the Sensu Go event’s metrics attribute acts as an intermediary format, which makes effective handlers simpler to write and spares operators from having to write mutator logic.
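To show why the intermediary format makes handlers simpler, here is a minimal Python sketch of a handler-side conversion from an event’s metrics points to InfluxDB line protocol. This is not how sensu-influxdb-handler is implemented; the function names, the "entity" tag, the single "value" field, and the sample event are assumptions made for the example.

```python
def escape(text: str, chars: str) -> str:
    """Backslash-escape the given characters for line protocol."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def points_to_line_protocol(event: dict) -> list:
    """Convert the event's metrics points into line protocol strings."""
    metrics = event.get("metrics") or {}
    entity = event.get("entity", {}).get("metadata", {}).get("name", "unknown")
    lines = []
    for point in metrics.get("points") or []:
        # Illustrative choice: tag every point with the entity name.
        tags = {"entity": entity}
        for tag in point.get("tags") or []:
            tags[tag["name"]] = tag["value"]
        tag_str = ",".join(
            f"{escape(k, ',= ')}={escape(v, ',= ')}"
            for k, v in sorted(tags.items())
        )
        measurement = escape(point["name"], ", ")
        # Timestamp stays in seconds, so a real write would need to
        # declare second precision.
        lines.append(
            f"{measurement},{tag_str} "
            f"value={float(point['value'])} {int(point['timestamp'])}"
        )
    return lines


# Made-up event carrying one already-extracted metric point.
sample_event = {
    "entity": {"metadata": {"name": "host01"}},
    "metrics": {
        "handlers": ["influxdb"],
        "points": [
            {
                "name": "host01.curl_timings.time_total",
                "value": 0.005,
                "timestamp": 1552506033,
                "tags": [],
            }
        ],
    },
}

for line in points_to_line_protocol(sample_event):
    print(line)
```

Nothing in the handler needs to know that the check originally printed graphite plaintext. A real handler would read the event JSON from stdin rather than use a hard-coded sample.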

The ruby metrics-influxdb.rb handler, provided as part of the sensu-plugins-influxdb plugin and commonly used in Sensu Core configurations, does not know (at the time I’m writing this) how to parse the Sensu Go event model and would still need to rely on a mutator to work correctly. If you are interested in continuing to use the same handler, there’s an opportunity to contribute to the plugin and enhance it to parse the Sensu Go event model.

Summary of recommendations for migrating metrics checks

  • Let the Sensu Go agent extract metrics from check command output by setting the output_metric_format check attribute. The agent will populate the Sensu event’s metrics attribute before sending the event to the Sensu backend server.
  • Set the metrics handler you want to use in the check’s output_metric_handlers attribute, and use handlers exclusively for workflows that require alerting on check return status.
  • Use sensu-influxdb-handler to send metrics into influxdb, as it knows how to parse the Sensu Go event’s metrics attribute.