Inserting nagios check return values into InfluxDB

I’m labbing up Sensu at the moment with a check that formats its data in Nagios check format. The check is working and returning status and values; what I now need to do is take those values and store them in InfluxDB. What I seem to need is a mutator that can format the output as separate InfluxDB values. I’m assuming this has already been done, though I can’t find a topic that covers it specifically.
For the moment I’m just using the debug handler. As it stands, the JSON output shows the return line but only creates a value for the number of successful iterations.

sensuctl asset add nixwiz/sensu-check-status-metric-mutator
sensuctl mutator create status-metric --namespace default -c "sensu-check-status-metric-mutator" -r "nixwiz/sensu-check-status-metric-mutator"
sensuctl asset add sensu/sensu-influxdb-handler

Create Check

sensuctl check create check-path -c "check-path.bin -t 3 8.8.8.8" -s "sla-sub" -i "10"
sensuctl check set-output-metric-format check-path nagios_perfdata

Create Debug handler

sensuctl handler create debug --type pipe --command "cat | python3 -m json.tool > /var/log/sensu/debug-event.json"
sensuctl check set-handlers check-path debug

[root@sensuapp sensu]# cat debug-event.json
{
"timestamp": 1602120925,
"entity": {
    "entity_class": "agent",
    "system": {
        "hostname": "agent2",
        "os": "linux",
        "platform": "centos",
        "platform_family": "rhel",
        "platform_version": "8.0.1905",
        "network": {
            "interfaces": [
                {
                    "name": "lo",
                    "addresses": [
                        "127.0.0.1/8",
                        "::1/128"
                    ]
                },
                {
                    "name": "eth0",
                    "mac": "52:54:00:72:fe:6e",
                    "addresses": [
                        "10.0.2.15/24",
                        "fe80::5054:ff:fe72:fe6e/64"
                    ]
                },
                {
                    "name": "eth1",
                    "mac": "08:00:27:75:10:15",
                    "addresses": [
                        "10.0.0.34/24",
                        "fe80::a00:27ff:fe75:1015/64"
                    ]
                }
            ]
        },
        "arch": "amd64"
    },
    "subscriptions": [
        "sla-sub",
        "entity:agent2"
    ],
    "last_seen": 1602120921,
    "deregister": false,
    "deregistration": {
        "handler": "example_handler"
    },
    "user": "admin",
    "redact": [
        "password",
        "passwd",
        "pass",
        "api_key",
        "api_token",
        "access_key",
        "secret_key",
        "private_key",
        "secret"
    ],
    "metadata": {
        "name": "agent2",
        "namespace": "default"
    }
},
"check": {
    "command": "check-path.bin -t 3 8.8.8.8",
    "handlers": [
        "debug"
    ],
    "high_flap_threshold": 0,
    "interval": 10,
    "low_flap_threshold": 0,
    "publish": true,
    "runtime_assets": null,
    "subscriptions": [
        "sla-sub"
    ],
    "proxy_entity_name": "",
    "check_hooks": null,
    "stdin": false,
    "subdue": null,
    "ttl": 0,
    "timeout": 0,
    "round_robin": false,
    "duration": 3.5689155809999997,
    "executed": 1602120921,
    "history": [
        {
            "status": 0,
            "executed": 1602120721
        },
        {
            "status": 0,
            "executed": 1602120731
        },
        {
            "status": 0,
            "executed": 1602120741
        },
        {
            "status": 0,
            "executed": 1602120751
        },
        {
            "status": 0,
            "executed": 1602120761
        },
        {
            "status": 0,
            "executed": 1602120771
        },
        {
            "status": 0,
            "executed": 1602120781
        },
        {
            "status": 0,
            "executed": 1602120791
        },
        {
            "status": 0,
            "executed": 1602120801
        },
        {
            "status": 0,
            "executed": 1602120811
        },
        {
            "status": 0,
            "executed": 1602120821
        },
        {
            "status": 0,
            "executed": 1602120831
        },
        {
            "status": 0,
            "executed": 1602120841
        },
        {
            "status": 0,
            "executed": 1602120851
        },
        {
            "status": 0,
            "executed": 1602120861
        },
        {
            "status": 0,
            "executed": 1602120871
        },
        {
            "status": 0,
            "executed": 1602120881
        },
        {
            "status": 0,
            "executed": 1602120891
        },
        {
            "status": 0,
            "executed": 1602120901
        },
        {
            "status": 0,
            "executed": 1602120911
        },
        {
            "status": 0,
            "executed": 1602120921
        }
    ],
    "issued": 1602120921,
    "output": "HOST OK - RTT 6.363ms; | hop=1,addr=10.0.2.2,rtt=0.285ms;\nhop=2,addr=10.16.4.1,rtt=1.729ms;\nhop=3,addr=203.219.198.39,rtt=6.255ms;\nhop=4,addr=203.29.134.61,rtt=5.674ms;\nhop=5,addr=209.85.149.84,rtt=6.333ms;\nhop=6,addr=108.170.247.65,rtt=5.799ms;\nhop=7,addr=209.85.254.119,rtt=5.88ms;\nhop=8,addr=8.8.8.8,rtt=6.695ms;\n",
    "state": "passing",
    "status": 0,
    "total_state_change": 0,
    "last_ok": 1602120921,
    "occurrences": 56,
    "occurrences_watermark": 56,
    "output_metric_format": "nagios_perfdata",
    "output_metric_handlers": [
        "influx-db"
    ],
    "env_vars": null,
    "metadata": {
        "name": "check-path",
        "namespace": "default"
    }
},
"metrics": {
    "handlers": [
        "influx-db"
    ],
    "points": [
        {
            "name": "check-path.status",
            "value": 0,
            "timestamp": 1602120925,
            "tags": [
                {
                    "name": "entity",
                    "value": "agent2"
                },
                {
                    "name": "check",
                    "value": "check-path"
                },
                {
                    "name": "state",
                    "value": "passing"
                },
                {
                    "name": "occurrences",
                    "value": "56"
                },
                {
                    "name": "occurrences_watermark",
                    "value": "56"
                }
            ]
        }
    ]
},
"metadata": {
    "namespace": "default"
}
}

I think this is what you’re looking for:

:+1:


That’s what is loaded currently; however, it only seems to turn the status into a metric, not the values returned in the check output.

The Sensu Vagrant box has been set up to this point; I’m having real trouble getting over this hump.

I assume what you are saying is missing is all of the metrics you expect from the output of this command, correct?

Yes the metrics are there, in the debug they are showing as output, they simply haven’t been broken out into individual values under the metrics section for consumption by InfluxDB.
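To make the gap concrete, here is a minimal Python sketch that lists the metric points in an event next to the raw check output. The `summarize_event` helper and the `sample` dict are illustrative only; `sample` is a heavily trimmed copy of the debug event posted above, with one hop kept in the output.

```python
def summarize_event(event):
    """Return (metric point names, raw check output) for a Sensu event dict."""
    points = (event.get("metrics") or {}).get("points") or []
    names = [point["name"] for point in points]
    output = (event.get("check") or {}).get("output", "")
    return names, output


# Trimmed copy of the debug event above (assumption: only the fields used here).
sample = {
    "check": {
        "output": "HOST OK - RTT 6.363ms; | hop=1,addr=10.0.2.2,rtt=0.285ms;\n",
        "output_metric_format": "nagios_perfdata",
    },
    "metrics": {
        "points": [{"name": "check-path.status", "value": 0}],
    },
}

names, output = summarize_event(sample)
print("metric points:", names)
print("raw output:", output.strip())
```

To run it against the real file, load `/var/log/sensu/debug-event.json` with `json.load` and pass the result to `summarize_event`. Only `check-path.status` shows up as a point, while the hop values stay inside `check.output`.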

Where is that check from, out of curiosity? While the output certainly looks like it is in Nagios Perf Data format (status text | measurements), I’m not sure that the embedded newlines in it make it parseable as such in Sensu. I’ll have to do some testing to make sure though.

I wrote it. It should conform to the Nagios check standard, but if not I can change it easily. The code is here: https://github.com/davetayl/Nagios-Plugins/tree/master/check-path

The spec definitely talks about multi-line output, though it doesn’t define the newline format as far as I can see.

It’s definitely not being parsed properly. If you check the Sensu agent logs (journalctl -u sensu-agent.service) you will probably find something like this in it:

"error":"unable to extract metric from check output","level":"error","metric":0,"msg":"invalid nagios perfdata metric: \"hop=1,addr=10.0.2.2,rtt=0.285ms;\\nhop=2,addr=10.16.4.1,rtt=1.729ms;\\nhop=3,addr=203.219.198.39,rtt=6.255ms;\\nhop=4,addr=203.29.134.61,rtt=5.674ms;\\nhop=5,addr=209.85.149.84,rtt=6.333ms;\\nhop=6,addr=108.170.247.65,rtt=5.799ms;\\nhop=7,addr=209.85.254.119,rtt=5.88ms;\\nhop=8,addr=8.8.8.8,rtt=6.695ms;\""

In reviewing the code for parsing nagios_perfdata, it does not appear to support multi-line output. Secondly, it expects the perf data to be space separated, as specified in the guidelines. Finally, it appears you are attempting to use tags with the generated metrics. For example, I would assume that for hop=1,addr=10.0.2.2,rtt=0.285ms you want to capture ‘rtt’ as the metric, with ‘hop’ and ‘addr’ being metadata (tags) for that metric. Is that correct? If so, then the nagios_perfdata format will not work, as it doesn’t support tagging.
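As a rough illustration of why that output is rejected, here is a simplified Python sketch of a perfdata parser. It is not Sensu's actual code; `parse_perfdata` is a hypothetical helper that assumes the usual `label=value[UOM];warn;crit;min;max` tokens separated by spaces after the pipe, and it ignores everything except the label and the value.

```python
import re

# A numeric value optionally followed by a unit of measurement, e.g. "6.118" or "0.285ms".
_VALUE = re.compile(r"^(-?\d+(?:\.\d+)?)([a-zA-Z%]*)$")


def parse_perfdata(output):
    """Return {label: value} from the perfdata section of Nagios-style check output."""
    if "|" not in output:
        raise ValueError("no performance data after '|'")
    _, perfdata = output.split("|", 1)
    metrics = {}
    for token in perfdata.split():
        label, sep, rest = token.partition("=")
        if not sep:
            raise ValueError(f"invalid perfdata metric: {token!r}")
        raw_value = rest.split(";", 1)[0]  # drop warn/crit/min/max
        match = _VALUE.match(raw_value)
        if match is None:
            raise ValueError(f"invalid perfdata metric: {token!r}")
        metrics[label] = float(match.group(1))
    if not metrics:
        raise ValueError("at least one performance data metric is required")
    return metrics


print(parse_perfdata("HOST OK - rtt = 6.118 ms | rtt=6.118"))
try:
    parse_perfdata("HOST OK - RTT 6.363ms; | hop=1,addr=10.0.2.2,rtt=0.285ms;\n")
except ValueError as err:
    print("rejected:", err)
```

In this sketch the comma-joined token is read as label `hop` with the value `1,addr=10.0.2.2,rtt=0.285ms`, which is not a number, so the whole token is rejected.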

Might I inquire as to the purpose of this check? Are you actually doing a check for certain limits being crossed? Or is this solely for metrics collection? If it is solely for metrics collection, I would suggest creating your output in InfluxDB Line or OpenTSDB Line format. Both are supported by Sensu and support metric tagging. You could still rely on the exit value of the check creating events while you capture the metrics.
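A small Python sketch of that split between status and metrics: the exit value carries the check status while the printed lines carry the metrics. The thresholds and the `exit_status` helper are hypothetical; the 0/1/2 values are the standard Nagios plugin exit codes for OK, WARNING and CRITICAL.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def exit_status(rtt_ms, warn_ms, crit_ms):
    """Map a round-trip time onto a Nagios-style exit code."""
    if rtt_ms >= crit_ms:
        return CRITICAL
    if rtt_ms >= warn_ms:
        return WARNING
    return OK


# Hypothetical thresholds of 50 ms (warning) and 100 ms (critical).
status = exit_status(6.695, warn_ms=50.0, crit_ms=100.0)
print("exit status:", status)
# A real check script would print its metric lines first and then
# call sys.exit(status) so the agent records the status.
```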

Brilliant feedback thank you.

You are correct about tagging. The test, in this case, can be run in three ways: a type 1 test just gives a status and a latency measure; a type 2 test returns a status with latency and a separate value for the highest-latency hop to the destination; a type 3 test returns the status and a list of all the hops with latency, and should conform to Nagios multi-line check output.

Regarding InfluxDB and OpenTSDB format, I can’t find a reference to whether they support multi-line return values.

Given that Nagios version 3 plugins support multi-line output, is there an intention to support it in Sensu in the future?

Yes, they do support multi-line output. So, for example, your output might look like this in InfluxDB (assuming path to be your measurement):

path,hop=1,addr=10.0.2.2 rtt=0.285 <timestamp>
path,hop=2,addr=10.16.4.1 rtt=1.729 <timestamp>
path,hop=3,addr=203.219.198.39 rtt=6.255 <timestamp>
path,hop=4,addr=203.29.134.61 rtt=5.674 <timestamp>
path,hop=5,addr=209.85.149.84 rtt=6.333 <timestamp>
path,hop=6,addr=108.170.247.65 rtt=5.799 <timestamp>
path,hop=7,addr=209.85.254.119 rtt=5.88 <timestamp>
path,hop=8,addr=8.8.8.8 rtt=6.695 <timestamp>
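A minimal Python sketch of how a check could emit lines like these. The `to_influx_lines` helper and the `rtt_ms` key are made up for illustration. The field value is written as a bare number without a unit suffix, since unquoted line protocol field values have to be numeric, and the timestamp is assumed to be in nanoseconds, the line protocol default.

```python
import time


def to_influx_lines(measurement, hops, timestamp_ns=None):
    """Build one InfluxDB line protocol entry per hop, tagged with hop and addr."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    lines = []
    for hop in hops:
        tags = f"hop={hop['hop']},addr={hop['addr']}"
        lines.append(f"{measurement},{tags} rtt={hop['rtt_ms']} {timestamp_ns}")
    return lines


hops = [
    {"hop": 1, "addr": "10.0.2.2", "rtt_ms": 0.285},
    {"hop": 2, "addr": "10.16.4.1", "rtt_ms": 1.729},
    {"hop": 8, "addr": "8.8.8.8", "rtt_ms": 6.695},
]
print("\n".join(to_influx_lines("path", hops)))
```

With output in this shape the check would use the influxdb_line output metric format instead of nagios_perfdata.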

As for future support for multi-line Nagios output in Sensu, I’m not sure. If support for this is something you’d like to see, I would suggest submitting a GitHub issue.


Thanks Todd, that’s a great help. I’m just rewriting the check now to output InfluxDB format. We should be able to use the existing one for status using the basic check type, then the new check script for the path data.

If you are interested, I’ve updated the Sensu Vagrant build to include this.


Things are definitely looking up; however, I’m a bit stuck on a Nagios error:

Oct 23 00:36:05 localhost sensu-agent[24183]: {"check":"check-path","component":"agent","error":"unable to extract metric from check output","level":"error","msg":"nagios perfdata format requires at least one performance data metric","namespace":"default","time":"2020-10-23T00:36:05Z"}
Oct 23 00:36:15 localhost sensu-agent[24183]: {"check":"check-path","component":"agent","error":"unable to extract metric from check output","level":"error","msg":"nagios perfdata format requires at least one performance data metric","namespace":"default","time":"2020-10-23T00:36:15Z"}

I have tried the following return values with the same result:
HOST OK - rtt = 6.081ms;
HOST OK - rtt = 6.081ms

Which should be OK from what I can see, but Sensu doesn’t seem to be able to derive a metric from the return value.

OK, so it looks like this has resolved it:

HOST OK - rtt = 6.118 ms | rtt=6.118

The following format includes a performance value in the status text, but it isn’t interpreted as such; the second half, after the pipe, needs to be there for performance metrics to be derived.

HOST OK - rtt = 6.081ms
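For reference, a tiny Python sketch of building the working form of the output, with the status text before the pipe and the perfdata after it. `format_check_output` is a hypothetical helper, not taken from the check's actual code.

```python
def format_check_output(state, rtt_ms):
    """Return Nagios-style output: human-readable status, a pipe, then perfdata."""
    status_text = f"HOST {state} - rtt = {rtt_ms} ms"
    perfdata = f"rtt={rtt_ms}"
    return f"{status_text} | {perfdata}"


print(format_check_output("OK", 6.118))
```

Only the part after the pipe is read as a metric; the status text is free-form.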

Is there an issue with the message below, “no assets defined for this check”?

Oct 23 03:20:16 localhost sensu-agent[24191]: {"assets":null,"check":"check-path-inf","component":"agent","level":"debug","msg":"no assets defined for this check","namespace":"default","time":"2020-10-23T03:20:16Z"}

hey,
debug level log messages shouldn’t be construed as errors.

If the check-path-inf check doesn’t reference any runtime assets as part of the check resource configuration, this message is just relaying that there really are no assets to process for that check.
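If it helps to separate real errors from debug noise, here is a small Python sketch that filters agent log lines by their "level" field. `entries_at_level` is a hypothetical helper, and the two sample lines are shortened copies of the journal lines posted above.

```python
import json


def entries_at_level(lines, level):
    """Return the JSON log entries whose "level" field equals the given level."""
    found = []
    for line in lines:
        start = line.find("{")  # skip the syslog prefix before the JSON payload
        if start == -1:
            continue
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue
        if entry.get("level") == level:
            found.append(entry)
    return found


SAMPLE = [
    'Oct 23 00:36:05 localhost sensu-agent[24183]: {"check":"check-path","level":"error",'
    '"msg":"nagios perfdata format requires at least one performance data metric"}',
    'Oct 23 03:20:16 localhost sensu-agent[24191]: {"check":"check-path-inf","level":"debug",'
    '"msg":"no assets defined for this check"}',
]

for entry in entries_at_level(SAMPLE, "error"):
    print(entry["check"], "-", entry["msg"])
```

The lines could come from journalctl -u sensu-agent.service piped into the script; only entries logged at the error level are printed.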