Just sharing a list of problems I've found

Spent the morning trying to figure out why I wasn’t getting alerts.
Finally solved the problem by simply restarting the backend.
In the process I came across several problems that I just wanted to share.

I’m running 5.12.0 on Ubuntu 16.04
I’m using Nagios for checks, specifically: /usr/lib/nagios/plugins/check_users -w 3 -c 5
I can easily generate alerts by logging into the client machine from several terminals simultaneously.
This was working but, for some unknown reason, stopped working.
The backend was seeing keepalives from the agent but the agent wasn’t running the checks.
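
For anyone reproducing this, here is a minimal sketch of running the same check by hand and reading its exit status. It assumes the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN). `status_name` is a hypothetical helper name made up for this example, not part of the plugin or of Sensu.

```shell
# Map a Nagios-style plugin exit status to its conventional name.
# (status_name is a hypothetical helper, introduced only for illustration.)
status_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Plugin path and thresholds as given in the post.
plugin=/usr/lib/nagios/plugins/check_users

# Run the check by hand only if the plugin is installed on this machine.
if [ -x "$plugin" ]; then
  rc=0
  "$plugin" -w 3 -c 5 || rc=$?
  echo "exit status: $rc ($(status_name "$rc"))"
fi
```

If this reports a non-OK status on the client while no alert shows up, the check itself is probably fine and the problem is in scheduling or execution on the agent side.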

Here are the problems that I came across:

  1. Obviously, having to restart the backend to fix this is itself a problem.
  2. No matter what I tried, the agent never wrote any logs to /var/log/sensu/*
  3. Sending a signal with "sudo kill -TRAP " does not toggle debug mode; it crashes the agent.
  4. If the config file /etc/sensu/agent.yml is invalid, starting the agent through systemd silently runs it with no config. Starting the agent from the command line errors out as expected.

This is just an FYI. I don’t need any further help. Thank you.

Without knowing the specifics, this was most likely fixed in version 5.14.

Fixed a bug that caused checks to stop executing after a network error.

See this pull request for further information:

Thanks for the info. Here’s a quick summary based on my current understanding.

  1. The need for the restart should hopefully be fixed in 5.14.0, as the previous post said.
  2. sensu-agent logs to stdout and stderr, with no option to log to a directory.
  • For systemd-based init (which includes Ubuntu 16.04), this means journald will capture the logs.
  • For sysvinit-based init (older systems without systemd), the provided SysV init script will redirect to
  • For Docker containers, this allows the container runtime/orchestrator to collect the output and expose it via its API.

Here are the docs with the details.
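
As a rough illustration of the journald point, the sketch below wraps `journalctl` in a small function. It assumes the packaged service is named `sensu-agent`. `agent_logs` is a hypothetical helper name, not a Sensu command.

```shell
# Show recent output of a systemd unit as captured by journald.
# The unit name defaults to sensu-agent (assumed name of the packaged service).
# agent_logs is a hypothetical helper, introduced only for illustration.
agent_logs() {
  journalctl -u "${1:-sensu-agent}" --no-pager --since "${2:-1 hour ago}"
}

# Usage (on the systemd host):
#   agent_logs                       # last hour of sensu-agent output
#   agent_logs sensu-backend today   # backend output since midnight
```

Depending on how journald is set up, reading the system journal may require sudo or membership in a group such as systemd-journal.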

  3. Hmm, I’m not sure what the intended design is here. I’ll poke the engineering team to see what was intended.

  4. Interesting, can you provide an example of an invalid config? This is probably an error in our systemd unit file that can be corrected if we can identify the problem.
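
While collecting an example, one way to compare the two start paths is sketched below. It is not an official procedure. It assumes the service is named `sensu-agent` and uses the config path from the original post. `run_agent_foreground` is a hypothetical helper name.

```shell
# Start the agent in the foreground with an explicit config file, so that a
# config parse error is printed to the terminal.
# run_agent_foreground is a hypothetical helper, introduced only for illustration.
run_agent_foreground() {
  sensu-agent start --config-file "${1:-/etc/sensu/agent.yml}"
}

# Usage (stop the service first so two agents do not run at once):
#   sudo systemctl stop sensu-agent
#   run_agent_foreground
#   systemctl cat sensu-agent    # print the unit file systemd actually uses
```

Comparing the `ExecStart` line printed by `systemctl cat` with the foreground command may show whether the unit passes the config file the same way.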

Hi Jef,

Thanks for the response.

So there is an issue open for this already in the feature backlog. This might be a good enhancement for a community contributor to take a stab at implementing.