Log monitoring


#1

Hello!

I’m trying to implement sensu-based log checking for my infrastructure.

Requirements are:

  1. The system should check logs in log4j/NLog format. An event should be fired when a line with level ERROR or FATAL appears in the log.

  2. The system should not clear events itself; only an admin can delete an event, saying ‘ok, I’ve seen it’.

  3. The system should not fire an error for the same log line twice.

  4. Logs should not be modified by monitoring.
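
As a rough illustration of requirement 1, here is a minimal Python sketch. It assumes a layout where the level appears as a standalone upper-case word after the timestamp; log4j and NLog layouts are configurable, so the pattern is an assumption, and `find_alert_lines` is a hypothetical helper name.

```python
import re

# Assumption: the level is a standalone upper-case word, as in
# "2014-10-06 13:33:31,123 ERROR [main] com.example.Foo - message".
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


def find_alert_lines(lines):
    """Return (level, line) pairs for lines that mention ERROR or FATAL."""
    hits = []
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            hits.append((match.group(1), line.rstrip("\n")))
    return hits
```

Note that this also matches an INFO line whose message text happens to contain the word ERROR; anchoring the pattern to the real layout would avoid that.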

The ideal implementation looks like this:

  1. Sensu checks the log from the latest mark to the end of the log.

  2. If an error or exception is found, a warning or error condition is raised and the mark is not moved. If no error is found, or the previous event was deleted by the user, the mark is moved to the current end of the log.

  3. The administrator looks at the error description, performs some action and deletes the event from Sensu. No log modification is performed.
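
The mark handling in the steps above could be sketched roughly as follows. This is only an illustration in Python: the mark is kept as a byte offset in a separate file, the log is opened read-only, and `read_since_mark` and `move_mark` are hypothetical names. Deciding when it is safe to call `move_mark` is exactly the part that depends on knowing whether the event was deleted.

```python
import os


def read_since_mark(log_path, mark_path):
    """Return (new_lines, end_offset) for complete lines after the saved mark.

    The log is opened read-only, so monitoring never modifies it.
    """
    try:
        with open(mark_path) as f:
            mark = int(f.read().strip() or 0)
    except FileNotFoundError:
        mark = 0  # no mark yet: start at the beginning of the log

    with open(log_path, "rb") as log:
        log.seek(mark)
        data = log.read()

    # Ignore a trailing partial line that is still being written.
    cut = data.rfind(b"\n") + 1
    lines = data[:cut].decode("utf-8", errors="replace").splitlines()
    return lines, mark + cut


def move_mark(mark_path, offset):
    """Persist the new mark; call this only when no alert is pending."""
    tmp_path = mark_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(str(offset))
    os.replace(tmp_path, mark_path)  # atomic swap of the mark file
```

Calling `read_since_mark` repeatedly without `move_mark` returns the same lines each time, which corresponds to "mark is not moved" in step 2.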

Unfortunately, this approach cannot be implemented with current Sensu, because an event handler cannot detect when an event has been removed by a user.

Am I missing something? Can I implement these requirements some other way?

Regards,

Alik.


#2

If you really want to literally fire an event whenever a certain line
is found, and not automatically clear the event,
then I would make the check utilize the local socket to send an
arbitrary event, and only send CRIT, and never OK.

If you want to make sure that there are separate events and never fire
for the same log line twice, hash the line and put the hash in the
check name.

1. Sensu check that reads the log and marks an offset
(http://linux.die.net/man/8/logtail) (check_name = check_the_log)
2. If an error is found, send a *new* event to the localhost socket
with part of the hash of the line in the check_name. (check_name =
logline_found_$hash)
3. Congrats, your dashboard will be full of events that will never
clear, no logs modified.
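
A rough Python sketch of step 2, for illustration only. It assumes the Sensu client socket is listening on 127.0.0.1 port 3030 and accepts a JSON check result with `name`, `output` and `status`; check your client configuration. `build_event` and `send_event` are hypothetical helper names.

```python
import hashlib
import json
import socket


def build_event(line, status=2):
    """Build a check result whose name embeds part of a hash of the log line."""
    digest = hashlib.sha1(line.encode("utf-8")).hexdigest()[:12]
    return {
        "name": "logline_found_" + digest,
        "output": line,
        "status": status,  # 2 = CRITICAL; an OK (0) result is never sent
    }


def send_event(event, host="127.0.0.1", port=3030):
    """Send the JSON result to the local Sensu client socket over TCP.

    Assumption: host and port match the client's socket settings.
    """
    payload = json.dumps(event).encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

Because the check name is derived from the line, sending the same line twice updates the same event instead of creating a second one.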

Working with the localhost socket is an advanced topic; I have a PR
to have it documented here:
https://github.com/sensu/sensu-docs/pull/132

On Mon, Oct 6, 2014 at 1:33 AM, Alik Kurdyukov <akurdyukov@gmail.com> wrote:

Hello!

I'm trying to implement sensu-based log checking for my infrastructure.

Requirements are:
1. The system should check logs in log4j/NLog format. An event should be fired
when a line with level ERROR or FATAL appears in the log.
2. The system should not clear events itself; only an admin can delete an event,
saying 'ok, I've seen it'.
3. The system should not fire an error for the same log line twice.
4. Logs should not be modified by monitoring.

The ideal implementation looks like this:
1. Sensu checks the log from the latest mark to the end of the log.
2. If an error or exception is found, a warning or error condition is raised and
the mark is not moved. If no error is found, or the previous event was deleted
by the user, the mark is moved to the current end of the log.
3. The administrator looks at the error description, performs some action and
deletes the event from Sensu. No log modification is performed.

Unfortunately, this approach cannot be implemented with current Sensu, because
an event handler cannot detect when an event has been removed by a user.

Am I missing something? Can I implement these requirements some other way?

Regards,
Alik.


#3

I’ve been wondering about the same thing for Windows. FYI, here is a logtail with offset tracking for Windows:

http://logtail-v3.sourceforge.net/

On Mon, Oct 6, 2014 at 11:03 AM, Kyle Anderson kyle@xkyle.com wrote:

If you really want to literally fire an event whenever a certain line
is found, and not automatically clear the event,
then I would make the check utilize the local socket to send an
arbitrary event, and only send CRIT, and never OK.

If you want to make sure that there are separate events and never fire
for the same log line twice, hash the line and put the hash in the
check name.

1. Sensu check that reads the log and marks an offset
(http://linux.die.net/man/8/logtail) (check_name = check_the_log)
2. If an error is found, send a new event to the localhost socket
with part of the hash of the line in the check_name. (check_name =
logline_found_$hash)
3. Congrats, your dashboard will be full of events that will never
clear, no logs modified.

Working with the localhost socket is an advanced topic; I have a PR
to have it documented here:

https://github.com/sensu/sensu-docs/pull/132



#4

Hey Alik,

Sensu is good at periodic, stateless checks. Looking inside a log is not stateless (you have to remember the last position).

Even if you create a check that keeps this state and always starts at the last read position, you’ll run into other problems: what if the file was rotated since the last run? Do you want to go look at the rotated file to make sure you didn’t miss anything? What if the rotated file is gzipped? What if rotation happens while you’re looking at the log? Do you need to reopen the new log now to keep looking, or can this wait until the next run of the check?
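
To make the rotation problem concrete, here is a small Python sketch of the simplest check a stateful reader would need. It assumes a POSIX filesystem where a rotated log gets a new inode; `resume_offset` is a hypothetical helper, and it deliberately does not try to recover lines left behind in the rotated (possibly gzipped) file.

```python
import os


def resume_offset(log_path, saved_inode, saved_offset):
    """Return (offset_to_resume_from, current_inode) for the log file.

    If the inode changed, or the file is now shorter than the saved
    offset, the file was rotated or truncated, so reading restarts at 0.
    """
    st = os.stat(log_path)
    rotated = saved_inode is not None and st.st_ino != saved_inode
    truncated = st.st_size < saved_offset
    if rotated or truncated:
        return 0, st.st_ino
    return saved_offset, st.st_ino
```

Even with this check, anything written to the old file between the last run and the rotation is missed, which is the gap described above.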

Your life will be easier if you consider logs as streams, not as files.

For checks based on log events, I would think you’re looking for a log processing pipeline like Logstash. Log shippers are inherently made to support log rotation and read everything as it comes in. Then Logstash can take an action every time it sees a specific pattern in a log (perform an API call, send an email, forward the log event somewhere else, you pick). Bonus: now you have a log processing pipeline!

Cheers,

Mat



#5

Guys,

Thank you for such insightful advice. I managed to set everything up using Graylog2 + my custom output plugin (https://github.com/akurdyukov/sensu-client-output) and it’s working. The only problem left is a configuration bug in ‘create new event for every new line’; I hope the guys from Graylog2 will help me fix it.

The final config works like this:

  1. NLog (the application logging framework) feeds log lines into Graylog2 using Gelf4NLog

  2. Graylog2 separates warnings, errors and fatals into a separate stream “Errors”

  3. The “Errors” stream is fed to the local Sensu client by the plugin

  4. The admin sees errors in the Sensu web interface. The admin can resolve an error and is warned again only when a new error log line arrives.
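
For readers who want a feel for step 3, here is a rough Python sketch of mapping a GELF message to a Sensu check result. It is not the code of the plugin linked above. It assumes the GELF `level` field carries syslog severities (4 = warning, 3 = error, lower = more severe) and uses the GELF `short_message` and `timestamp` fields; `gelf_to_sensu` and the `graylog_error_` prefix are hypothetical names.

```python
import hashlib


def gelf_to_sensu(message):
    """Map a GELF message dict to a Sensu check result, or None to skip it.

    Assumption: "level" is a syslog severity, where lower is more severe.
    """
    level = int(message.get("level", 6))
    if level <= 3:
        status = 2  # error or worse -> CRITICAL
    elif level == 4:
        status = 1  # warning -> WARNING
    else:
        return None  # informational messages never become events

    text = message.get("short_message", "")
    # Hash timestamp plus text so every log line yields its own event name.
    key = "{}|{}".format(message.get("timestamp", ""), text)
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
    return {"name": "graylog_error_" + digest, "output": text, "status": status}
```

Including the timestamp in the hash means a resolved event is not reopened by an old line, while a new line with the same text still raises a fresh event.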



#6

I think the machines in C and D are likely to have different monitoring
configurations, and the number of machines is a few. Also bandwidth is
precious here.
