Unable to get system metrics

Sensu Go 6.6.1

This is the check:

---
type: CheckConfig
api_version: core/v2
metadata:
  name: collect-system-metrics
spec:
  check_hooks: null
  command: system-check
  env_vars: null
  handlers: []
  high_flap_threshold: 0
  interval: 10
  low_flap_threshold: 0
  output_metric_format: influxdb_line
  output_metric_handlers:
  - pipe_handler_minimum
  - influxdb-handler
  pipelines: []
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets:
  - system-check
  secrets: null
  stdin: false
  subdue: null
  subscriptions:
  - system-check
  timeout: 0
  ttl: 0

This is the handler:

---
type: Handler
api_version: core/v2
metadata:
  created_by: admin
  name: influxdb-handler
  namespace: default
spec:
  command: sensu-influxdb-handler -d sensu -l
  env_vars:
  - INFLUXDB_ADDR=http://10.2.82.167:8086
  filters: null
  handlers: null
  runtime_assets:
  - sensu-influxdb-handler
  secrets: null
  timeout: 0
  type: pipe

This is what I get in syslog:

Dec  8 01:49:53 sensu-01 sensu-agent[14252]: {"check":"collect-system-metrics","component":"agent","error":"unable to extract metric from check output","level":"error","line":1,"msg":"influxdb line format requires 2 arguments with a 3rd (optional) timestamp","namespace":"default","time":"2021-12-08T01:49:53Z"}
Dec  8 01:49:53 sensu-01 sensu-agent[14252]: {"check":"collect-system-metrics","component":"agent","error":"unable to extract metric from check output","level":"error","line":2,"msg":"metric field set is invalid, must contain a key=value pair","namespace":"default","time":"2021-12-08T01:49:53Z"}

This is a chunk of output from the system-check executable:

# ./system-check | head
# HELP system_cpu_cores [GAUGE] Number of cpu cores on the system
# TYPE system_cpu_cores GAUGE
system_cpu_cores{} 2 1638928235661
# HELP system_cpu_idle [GAUGE] Percent of time all cpus were idle
# TYPE system_cpu_idle GAUGE
system_cpu_idle{cpu="cpu0"} 96.94915254236494 1638928235661
system_cpu_idle{cpu="cpu1"} 98.98305084737528 1638928235661
system_cpu_idle{cpu="cpu-total"} 97.96610169487136 1638928235661
# HELP system_cpu_used [GAUGE] Percent of time all cpus were used
# TYPE system_cpu_used GAUGE

And this is the output from the debug handler:

{
  "sequence": 1646,
  "pipelines": null,
  "timestamp": 1638928283,
  "entity": {
    "entity_class": "agent",
    "system": {
      "hostname": "sensu-01",
      "os": "linux",
      "platform": "ubuntu",
      "platform_family": "debian",
      "platform_version": "20.04",
      "network": {
        "interfaces": [
          {
            "name": "lo",
            "addresses": [
              "127.0.0.1/8"
            ]
          },
          {
            "name": "eth0",
            "mac": "02:b6:4f:55:f7:f9",
            "addresses": [
              "10.2.82.167/21",
              "fe80::b6:4fff:fe55:f7f9/64"
            ]
          }
        ]
      },
      "arch": "amd64",
      "libc_type": "glibc",
      "vm_system": "xen",
      "vm_role": "guest",
      "cloud_provider": "EC2",
      "processes": null
    },
    "subscriptions": [
      "http-checks-remote",
      "system-check",
      "entity:sensu-01"
    ],
    "last_seen": 1638928280,
    "deregister": false,
    "deregistration": {
      "handler": "slack"
    },
    "user": "spree3d",
    "redact": [
      "password",
      "passwd",
      "pass",
      "api_key",
      "api_token",
      "access_key",
      "secret_key",
      "private_key",
      "secret"
    ],
    "metadata": {
      "name": "sensu-01",
      "namespace": "default",
      "labels": {
        "sensu.io/managed_by": "sensu-agent"
      }
    },
    "sensu_agent_version": "6.6.1",
    "keepalive_handlers": [
      "slack"
    ]
  },
  "check": {
    "command": "system-check",
    "handlers": [],
    "high_flap_threshold": 0,
    "interval": 10,
    "low_flap_threshold": 0,
    "publish": true,
    "runtime_assets": [
      "system-check"
    ],
    "subscriptions": [
      "system-check"
    ],
    "proxy_entity_name": "",
    "check_hooks": null,
    "stdin": false,
    "subdue": null,
    "ttl": 0,
    "timeout": 0,
    "round_robin": false,
    "duration": 3.007705834,
    "executed": 1638928280,
    "history": [
      {
        "status": 0,
        "executed": 1638928080
      },
      {
        "status": 0,
        "executed": 1638928090
      },
      {
        "status": 0,
        "executed": 1638928100
      },
      {
        "status": 0,
        "executed": 1638928110
      },
      {
        "status": 0,
        "executed": 1638928120
      },
      {
        "status": 0,
        "executed": 1638928130
      },
      {
        "status": 0,
        "executed": 1638928140
      },
      {
        "status": 0,
        "executed": 1638928150
      },
      {
        "status": 0,
        "executed": 1638928160
      },
      {
        "status": 0,
        "executed": 1638928170
      },
      {
        "status": 0,
        "executed": 1638928180
      },
      {
        "status": 0,
        "executed": 1638928190
      },
      {
        "status": 0,
        "executed": 1638928200
      },
      {
        "status": 0,
        "executed": 1638928210
      },
      {
        "status": 0,
        "executed": 1638928220
      },
      {
        "status": 0,
        "executed": 1638928230
      },
      {
        "status": 0,
        "executed": 1638928240
      },
      {
        "status": 0,
        "executed": 1638928250
      },
      {
        "status": 0,
        "executed": 1638928260
      },
      {
        "status": 0,
        "executed": 1638928270
      },
      {
        "status": 0,
        "executed": 1638928280
      }
    ],
    "issued": 1638928280,
    "output": "# HELP system_cpu_cores [GAUGE] Number of cpu cores on the system\n# TYPE system_cpu_cores GAUGE\nsystem_cpu_cores{} 2 1638928280182\n# HELP system_cpu_idle [GAUGE] Percent of time all cpus were idle\n# TYPE system_cpu_idle GAUGE\nsystem_cpu_idle{cpu=\"cpu0\"} 97.97979797975468 1638928280182\nsystem_cpu_idle{cpu=\"cpu1\"} 98.6394557822843 1638928280182\nsystem_cpu_idle{cpu=\"cpu-total\"} 98.63945578240636 1638928280182\n# HELP system_cpu_used [GAUGE] Percent of time all cpus were used\n# TYPE system_cpu_used GAUGE\nsystem_cpu_used{cpu=\"cpu0\"} 2.0202020202453213 1638928280182\nsystem_cpu_used{cpu=\"cpu1\"} 1.3605442177156988 1638928280182\nsystem_cpu_used{cpu=\"cpu-total\"} 1.3605442175936417 1638928280182\n# HELP system_cpu_user [GAUGE] Percent of time total cpu was used by normal processes in user mode\n# TYPE system_cpu_user GAUGE\nsystem_cpu_user{cpu=\"cpu0\"} 0.673400673401711 1638928280182\nsystem_cpu_user{cpu=\"cpu1\"} 0.34013605442119 1638928280182\nsystem_cpu_user{cpu=\"cpu-total\"} 0.5102040816324163 1638928280182\n# HELP system_cpu_system [GAUGE] Percent of time all cpus used by processes executed in kernel mode\n# TYPE system_cpu_system GAUGE\nsystem_cpu_system{cpu=\"cpu0\"} 0.673400673399797 1638928280182\nsystem_cpu_system{cpu=\"cpu1\"} 0.3401360544221567 1638928280182\nsystem_cpu_system{cpu=\"cpu-total\"} 0.3401360544216109 1638928280182\n# HELP system_cpu_nice [GAUGE] Percent of time all cpus used by niced processes in user mode\n# TYPE system_cpu_nice GAUGE\nsystem_cpu_nice{cpu=\"cpu0\"} 0 1638928280182\nsystem_cpu_nice{cpu=\"cpu1\"} 0 1638928280182\nsystem_cpu_nice{cpu=\"cpu-total\"} 0 1638928280182\n# HELP system_cpu_iowait [GAUGE] Percent of time all cpus waiting for I/O to complete\n# TYPE system_cpu_iowait GAUGE\nsystem_cpu_iowait{cpu=\"cpu0\"} 0.33670033670037697 1638928280182\nsystem_cpu_iowait{cpu=\"cpu1\"} 0.3401360544216734 1638928280182\nsystem_cpu_iowait{cpu=\"cpu-total\"} 0.3401360544216109 1638928280182\n# HELP 
system_cpu_irq [GAUGE] Percent of time all cpus servicing interrupts\n# TYPE system_cpu_irq GAUGE\nsystem_cpu_irq{cpu=\"cpu0\"} 0 1638928280182\nsystem_cpu_irq{cpu=\"cpu1\"} 0 1638928280182\nsystem_cpu_irq{cpu=\"cpu-total\"} 0 1638928280182\n# HELP system_cpu_sortirq [GAUGE] Percent of time all cpus servicing software interrupts\n# TYPE system_cpu_sortirq GAUGE\nsystem_cpu_sortirq{cpu=\"cpu0\"} 0.33670033670013777 1638928280182\nsystem_cpu_sortirq{cpu=\"cpu1\"} 0.3401360544214317 1638928280182\nsystem_cpu_sortirq{cpu=\"cpu-total\"} 0.1700680272109263 1638928280182\n# HELP system_cpu_stolen [GAUGE] Percent of time all cpus serviced virtual hosts operating systems\n# TYPE system_cpu_stolen GAUGE\nsystem_cpu_stolen{cpu=\"cpu0\"} 0 1638928280182\nsystem_cpu_stolen{cpu=\"cpu1\"} 0 1638928280182\nsystem_cpu_stolen{cpu=\"cpu-total\"} 0 1638928280182\n# HELP system_cpu_guest [GAUGE] Percent of time all cpus serviced guest operating system\n# TYPE system_cpu_guest GAUGE\nsystem_cpu_guest{cpu=\"cpu0\"} 0 1638928280182\nsystem_cpu_guest{cpu=\"cpu1\"} 0 1638928280182\nsystem_cpu_guest{cpu=\"cpu-total\"} 0 1638928280182\n# HELP system_cpu_guest_nice [GAUGE] Percent of time all cpus serviced niced guest operating system\n# TYPE system_cpu_guest_nice GAUGE\nsystem_cpu_guest_nice{cpu=\"cpu0\"} 0 1638928280182\nsystem_cpu_guest_nice{cpu=\"cpu1\"} 0 1638928280182\nsystem_cpu_guest_nice{cpu=\"cpu-total\"} 0 1638928280182\n# HELP system_mem_used [GAUGE] Percent of memory used\n# TYPE system_mem_used GAUGE\nsystem_mem_used{} 12.461227268926468 1638928280182\n# HELP system_mem_used_bytes [GAUGE] Used memory in bytes\n# TYPE system_mem_used_bytes GAUGE\nsystem_mem_used_bytes{} 5.12745472e+08 1638928280182\n# HELP system_mem_total_bytes [GAUGE] Total memory in bytes\n# TYPE system_mem_total_bytes GAUGE\nsystem_mem_total_bytes{} 4.114726912e+09 1638928280182\n# HELP system_swap_used [GAUGE] Percent of swap used\n# TYPE system_swap_used GAUGE\nsystem_swap_used{} 0 1638928280182\n# HELP 
system_swap_used_bytes [GAUGE] Used swap in bytes\n# TYPE system_swap_used_bytes GAUGE\nsystem_swap_used_bytes{} 5.12745472e+08 1638928280182\n# HELP system_swap_total_bytes [GAUGE] Total swap in bytes\n# TYPE system_swap_total_bytes GAUGE\nsystem_swap_total_bytes{} 0 1638928280182\n# HELP system_load_load1 [GAUGE] System load averaged over 1 minute, high load value dependant on number of cpus in system\n# TYPE system_load_load1 GAUGE\nsystem_load_load1{} 0.13 1638928280182\n# HELP system_load_load5 [GAUGE] System load averaged over 5 minute, high load value dependent on number of cpus in system\n# TYPE system_load_load5 GAUGE\nsystem_load_load5{} 0.05 1638928280182\n# HELP system_load_load15 [GAUGE] System load averaged over 15 minute, high load value dependent on number of cpus in system\n# TYPE system_load_load15 GAUGE\nsystem_load_load15{} 0.03 1638928280182\n# HELP system_load_load1_per_cpu [GAUGE] System load averaged over 1 minute normalized by cpu count, values > 1 means system may be overloaded\n# TYPE system_load_load1_per_cpu GAUGE\nsystem_load_load1_per_cpu{} 0.065 1638928280182\n# HELP system_load_load5_per_cpu [GAUGE] System load averaged over 5 minute normalized by cpu count, values > 1 means system may be overloaded\n# TYPE system_load_load5_per_cpu GAUGE\nsystem_load_load5_per_cpu{} 0.025 1638928280182\n# HELP system_load_load15_per_cpu [GAUGE] System load averaged over 15 minute normalized by cpu count, values > 1 means system may be overloaded\n# TYPE system_load_load15_per_cpu GAUGE\nsystem_load_load15_per_cpu{} 0.015 1638928280182\n# HELP system_host_uptime [COUNTER] Host uptime in seconds\n# TYPE system_host_uptime COUNTER\nsystem_host_uptime{} 18639 1638928280182\n# HELP system_host_processes [GAUGE] Number of host processes\n# TYPE system_host_processes GAUGE\nsystem_host_processes{} 119 1638928280182\n\n",
    "state": "passing",
    "status": 0,
    "total_state_change": 0,
    "last_ok": 1638928280,
    "occurrences": 1646,
    "occurrences_watermark": 1646,
    "output_metric_format": "influxdb_line",
    "output_metric_handlers": [
      "pipe_handler_minimum",
      "influxdb-handler"
    ],
    "env_vars": null,
    "metadata": {
      "name": "collect-system-metrics",
      "namespace": "default",
      "labels": {
        "sensu.io/managed_by": "sensuctl"
      }
    },
    "secrets": null,
    "is_silenced": false,
    "scheduler": "memory",
    "processed_by": "sensu-01",
    "pipelines": []
  },
  "metrics": {
    "handlers": [
      "pipe_handler_minimum",
      "influxdb-handler"
    ],
    "points": null
  },
  "metadata": {
    "namespace": "default"
  },
  "id": "8992405f-f454-4a0c-a78c-5d289380d388"
}

What’s going on? I’ve tried every handler I could find (Prometheus, Graphite, etc.) and nothing works.

The output_metric_format setting is specific to the check, not the handler. It tells the Sensu agent which format the check's metrics are in, so the agent can parse them at the edge for optimized downstream processing.

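This plugin generates metrics in Prometheus format, but the check declares influxdb_line, so the agent fails to parse the output. That is where the syslog errors come from, and why the event reaches the handlers with "points": null. Setting output_metric_format: prometheus_text should do the trick.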
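For reference, here is a trimmed sketch of the check from the question with only that one value changed. The other fields are copied from the original definition, and the fields left at their defaults are omitted for brevity, so treat it as an illustration rather than a complete definition:

```yaml
---
type: CheckConfig
api_version: core/v2
metadata:
  name: collect-system-metrics
spec:
  command: system-check
  interval: 10
  publish: true
  runtime_assets:
  - system-check
  subscriptions:
  - system-check
  # Describes what the check prints (Prometheus exposition text),
  # not the backend the handler writes to.
  output_metric_format: prometheus_text
  output_metric_handlers:
  - pipe_handler_minimum
  - influxdb-handler
```

The handler definition should not need any changes. Once the agent can parse the output, the "points" array in the debug handler's event should be populated instead of null.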

I hope this helps!