Some questions on Sensu silencing hosts/checks and visualization


#1

Hi guys,
first of all thanks for the great work you are doing with Sensu. It’s awesome because it’s damn simple and still powerful.

Where I work we have just decided to re-engineer the whole monitoring infrastructure, and I have been looking around for a simple and effective solution that does not date back to the '90s, like Nagios and similar tools.

I am experimenting a bit with a number of VMs and I came up with some questions that neither I nor the documentation could answer. Maybe somebody here can help out with them:

  1. Is it possible to associate custom properties with hosts or checks? That would be nice in order to filter records based on those properties.
  2. It is not clear what silencing a check/host does. OK, it creates a stash for later use by handlers, but I would have expected the event to be somehow grayed out or put out of sight. What I want to achieve is saying “OK, I know this check/host is critical but do not bother me anymore, I will tell you later when to start again”
  3. Is there a way to group the hosts somehow? For instance I would like to have a dashboard or a screen with just the production environment hosts/checks. The ops team wants to know if a business service is OK in production somehow. Maybe the Sensu way to do it is to filter on queues?

Hope we are going to use it so I can have my chance to actively contribute. I am really looking forward to it.

Best,

Marco


#2

Is it possible to associate custom properties with hosts or checks? That
would be nice in order to filter records based on those properties.

You can add custom properties in the client JSON, and your handlers
can operate on them.
However, that metadata is not currently available to the dashboard, as
it is not fully exposed via the event API.
I'm pretty sure that will change with Sensu 0.13, per portertech.
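
To make the "custom properties in the client JSON" idea concrete, here is a
minimal sketch in Python. It assumes the custom keys from the client
definition show up under "client" in the event a handler receives. The keys
"environment" and "team", the file path, and the helper names are made up
for the example; they are not part of Sensu.

```python
import json

# Hypothetical client definition, e.g. /etc/sensu/conf.d/client.json.
# "environment" and "team" are custom keys chosen for this example.
CLIENT_JSON = """
{
  "client": {
    "name": "web01",
    "address": "10.0.0.11",
    "subscriptions": ["webserver"],
    "environment": "production",
    "team": "ops"
  }
}
"""


def wants_event(event, key, value):
    """Return True when the event's client carries key == value."""
    return event.get("client", {}).get(key) == value


def handle(raw_event):
    """Decide what to do with one event, given as a JSON string.

    A pipe handler would read this string from standard input.
    """
    event = json.loads(raw_event)
    if wants_event(event, "environment", "production"):
        return "notify: %s/%s" % (event["client"]["name"], event["check"]["name"])
    return "ignore"


# Build a sample event from the client definition above and handle it.
sample = {
    "client": json.loads(CLIENT_JSON)["client"],
    "check": {"name": "check_disk", "status": 2},
}
print(handle(json.dumps(sample)))
```

In a real pipe handler you would pass `sys.stdin.read()` to `handle()` instead
of the sample event.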

It is not clear what silencing a check/host does. OK, it creates a stash for
later use by handlers, but I would have expected the event to be somehow
grayed out or put out of sight. What I want to achieve is saying "OK, I know
this check/host is critical but do not bother me anymore, I will tell you
later when to start again"

Yeah. Right now the little speaker icon is the only indication.
I'm also holding out for a better dashboard to improve this.

I'm holding my breath for this: https://github.com/palourde/uchiwa
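
On the handler side, "silenced" just means the handler looks for a matching
stash before it notifies anyone. Here is a rough Python sketch of that
lookup. It assumes the stash paths follow the silence/<client> and
silence/<client>/<check> convention, and that you already have the list of
existing stash paths (for example from the API's stash listing). Both
function names are made up for the example.

```python
def silence_paths(event):
    """Stash paths that would silence this event (assumed path convention)."""
    client = event["client"]["name"]
    check = event["check"]["name"]
    return [
        "silence/%s" % client,              # the whole host is silenced
        "silence/%s/%s" % (client, check),  # one check on one host is silenced
    ]


def is_silenced(event, stash_paths):
    """Return True if any existing stash path covers the event."""
    existing = set(stash_paths)
    return any(path in existing for path in silence_paths(event))


event = {"client": {"name": "web01"}, "check": {"name": "check_disk"}}

# Silencing the single check covers the event.
print(is_silenced(event, ["silence/web01/check_disk"]))

# A stash for another host does not.
print(is_silenced(event, ["silence/db01"]))
```

A handler would skip its notification whenever `is_silenced()` returns True,
which is why the event itself still shows up on the dashboard.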

Is there a way to group the hosts somehow? For instance I would like to
have a dashboard or a screen with just the production environment
hosts/checks. The ops team wants to know if a business service is OK in
production somehow. Maybe the Sensu way to do it is to filter on queues?

I agree that this would also be a killer feature. It is not currently
possible with the basic stock Sensu dashboard.

The dashboard would also be super useful with permalinks. So maybe you
could tag all the hosts or services that belong to the ops team:
http://sensu/?team=ops

And then they only see what they need?
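
To show what such a permalink could mean in practice, here is a small Python
sketch that turns the query string into tag filters and applies them to a
list of events. Nothing like this exists in the stock dashboard. The "team"
and "datacenter" tags and the function names are hypothetical, and it
assumes the tags are stored as strings on the client.

```python
from urllib.parse import parse_qsl, urlparse


def filters_from_url(url):
    """Turn a permalink's query string into a {tag: value} dict."""
    return dict(parse_qsl(urlparse(url).query))


def filter_events(events, filters):
    """Keep only events whose client matches every tag in filters."""
    return [
        event for event in events
        if all(event.get("client", {}).get(tag) == value
               for tag, value in filters.items())
    ]


# Sample events with hypothetical custom tags on the client.
events = [
    {"client": {"name": "web01", "team": "ops", "datacenter": "dc1"},
     "check": {"name": "check_http"}},
    {"client": {"name": "db01", "team": "dba", "datacenter": "dc1"},
     "check": {"name": "check_disk"}},
]

filters = filters_from_url("http://sensu/?team=ops")
for event in filter_events(events, filters):
    print(event["client"]["name"], event["check"]["name"])
```

Because the filter is just a dict, adding more tags to the URL (for example
`?team=ops&datacenter=dc1`) narrows the view without any extra code.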

Yes. I hope that Sensu, like Graphite, gets a wave of custom dashboards
to meet lots of different needs. Again, this will be better when the next
release of Sensu is out and the full event API is available.
My personal wishlist is:

* Easy subdue/silence that can be applied to multiple services/hosts at once
* Multi API support (I have many Sensu clusters)
* Permalinks
* Filters on arbitrary, user-provided tags (team, datacenter,
runtime, service_name, who knows, don't care, tags I can't even think
of yet)
* Bonus points for a good JavaScript experience (Nagios and kin are
not a hard bar to clear)

* Maybe bonus points for providing some extension mechanism so that
hosts can expose links, images, or text about the service or host:
a link to a Graphite dashboard, a disk check whose JSON includes a
link to a Graphite graph of the disk usage over time, or a link to a
PagerDuty incident.
This could all be feature creep; I don't know if I really want this.


On Tue, Jul 15, 2014 at 11:05 AM, Marco Tizzoni <elibus@gmail.com> wrote: