Best practice for monitoring a large number of switches and routers

Hi,

We have been using Nagios to monitor a large number (approx. 200) of switches and routers, and are planning to move to Sensu. The bare minimum functionality we need is ping monitoring for link availability, and its associated alerting.

What is the best practice for monitoring such a network? Do I run a Sensu client on one machine and have it ping all the switches? Does such an approach scale? How about running a separate program that pings all the devices and sends the results to the Sensu client using the socket API, with the source attribute set?

Any better ideas?

Thanks!

raj

I've done both approaches. The first one scales "ok"; 200 devices should
be fine for that kind of pattern.
Remember you can set the "source" attribute on standalone or
subscription checks too.
Also, you don't have to have a SPOF monitoring server: you can use
round-robin subscriptions
https://sensuapp.org/docs/0.24/reference/clients.html#round-robin-client-subscriptions
to get a kind of "HA-style" "pinging cluster", if that makes sense.
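
For what it's worth, here is a rough sketch of the first approach in
Python. It generates one check definition per device, with "source" set
to the device name and a round-robin subscription so that any client in
the pool can run the check. The inventory, the "roundrobin:pingers"
subscription name, the "ping-" name prefix, and the check_ping
thresholds are all made up for illustration. It also assumes the Nagios
check_ping plugin is on the client's PATH.

```python
import json


def build_ping_checks(switches, subscription="roundrobin:pingers", interval=60):
    """Return a Sensu check-definition dict with one ping check per device.

    `switches` maps a device name to its address. Both the subscription
    name and the check_ping thresholds are illustrative assumptions.
    """
    checks = {}
    for name, address in switches.items():
        checks[f"ping-{name}"] = {
            # Nagios-style plugin; Sensu runs it and uses the exit status.
            "command": f"check_ping -H {address} -w 100.0,20% -c 500.0,60%",
            # Round-robin: only one client in the pool runs each request.
            "subscribers": [subscription],
            "interval": interval,
            # Results show up under the device, not the pinging client.
            "source": name,
        }
    return {"checks": checks}


if __name__ == "__main__":
    # Hypothetical inventory; replace with your own source of truth.
    inventory = {"switch-01": "192.0.2.11", "router-01": "192.0.2.1"}
    print(json.dumps(build_ping_checks(inventory), indent=2, sort_keys=True))
```

The output could be written to a file under /etc/sensu/conf.d/ on the
Sensu server, with each pinging client listing "roundrobin:pingers" in
its subscriptions.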

The second approach (a separate program that sends results to the
client socket) does scale better.
You also get the benefit of driving it from an easy-to-use "source of
truth" (if you have such a thing) for which switches are out there.
Here is a similar but older example of this:

This needs to be documented I guess, as it is a common question.
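
In the meantime, a rough Python sketch of the second approach might look
like the following. It pings every device in an inventory in parallel
and posts one JSON result per device to the local Sensu client socket,
with "source" set to the device name. 127.0.0.1:3030 is the client
socket's default address; the inventory, the worker count, the "ping"
check name, and the ping flags (Linux iputils) are assumptions.

```python
import json
import socket
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ping(address, count=3, timeout=5):
    """Return True if the host answers ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def build_result(name, address, alive):
    """Build a check result for the Sensu client socket."""
    state = "OK" if alive else "CRITICAL"
    return {
        "name": "ping",              # check name (illustrative)
        "source": name,              # result is filed under this device
        "status": 0 if alive else 2, # 0 = OK, 2 = CRITICAL
        "output": f"PING {state} - {address}",
    }


def send_result(result, host="127.0.0.1", port=3030):
    """Send one JSON result to the local Sensu client socket over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(json.dumps(result).encode("utf-8"))


def check_all(inventory, workers=20):
    """Ping all devices concurrently and return their check results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        alive = pool.map(ping, inventory.values())
        return [
            build_result(name, address, up)
            for (name, address), up in zip(inventory.items(), alive)
        ]


def main():
    # Hypothetical inventory; load it from your source of truth instead.
    inventory = {"switch-01": "192.0.2.11", "router-01": "192.0.2.1"}
    for result in check_all(inventory):
        send_result(result)
```

Calling main() from cron or a small loop on a machine that runs a Sensu
client would be one way to drive this. Handlers and alerting then apply
to each device as if it were its own client.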

On Thu, Jun 9, 2016 at 4:51 AM, Raj <rajlistuser@gmail.com> wrote:
