Check result not sent to server until first failure

I want to monitor status of objects in a cloud environment.

I have a Sensu 0.26 client running checks on remote hosts and sending the results to the Sensu server (0.29).

Looking at the logs, I don’t seem to get anything on the server until the first time the check fails.

Is there some kind of filtering on the client side?

I have the following filter defined on the server, in /etc/sensu/conf.d/filters.json:

```json
{
  "filters": {
    "state_change_only": {
      "negate": false,
      "attributes": {
        "occurrences": "eval: value == 1 || ':::action:::' == 'resolve'"
      }
    }
  }
}
```
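To make the filter above easier to reason about, here is a minimal Ruby sketch of what its eval expression admits. It is only a stand-in: the method name `state_change_only?` and the sample events are hypothetical, and it assumes the `:::action:::` token has already been replaced by the event's action before the expression runs.

```ruby
# Plain-Ruby stand-in for the filter's eval expression, assuming the
# ':::action:::' token has already been replaced by the event's action.
def state_change_only?(occurrences, action)
  occurrences == 1 || action == 'resolve'
end

# Hypothetical event sequence: first failure, repeated failure, recovery.
samples = [
  { occurrences: 1, action: 'create' },
  { occurrences: 2, action: 'create' },
  { occurrences: 2, action: 'resolve' }
]

samples.each do |event|
  kept = state_change_only?(event[:occurrences], event[:action])
  puts "#{event.inspect}: #{kept ? 'handled' : 'filtered out'}"
end
```

With "negate": false the filter is inclusive, so only events for which the expression is true reach the handler. Note that this only applies to events that get as far as the filter stage.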

On the client side I see something like the following (the output and the command parameters are removed for readability):

```json
{"timestamp":"2017-07-11T16:06:06.587807+0300","level":"info","message":"publishing check result","payload":{"client":"node-3","check":{"command":"check_test.py …","handlers":["file"],"interval":15,"standalone":true,"subscribers":["base"],"name":"test-filter-02","issued":1499778362,"executed":1499778362,"duration":4.562,"output":"…","status":0}}}
```

Any help would be appreciated.

Thanks,

Yaron

After some debugging and code diving, I figured out that this is currently working as intended: the result does not even reach the place where filtering happens. Instead, it is ignored until the first time the status is not OK.

Unfortunately, this does not suit my needs, which call for a state snapshot of each object right away, even if it starts with status OK.

I therefore went ahead and modified lib/server/process.rb: I added a check type “snapshot”, and in update_event_registry I test whether the check is of type “snapshot” and whether the event is already in Redis. If it is not there, I add it and yield(true) so that process_event() gets called.
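The decision described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the Sensu source or the actual patch: a Hash stands in for the Redis event registry, the method name `handle_result?` is made up, and the behaviour for a second OK result of a “snapshot” check (ignored here) is an assumption, since the thread does not spell it out.

```ruby
# Simplified, hypothetical model of the registry decision. NOT the Sensu
# source: a Hash stands in for Redis and the method name is invented.
def handle_result?(registry, client, check)
  key = "#{client}/#{check[:name]}"
  stored = registry[key]
  if check[:status] != 0
    registry[key] = check        # failing result: store the event and handle it
    true
  elsif stored && stored[:status] != 0
    registry.delete(key)         # OK after a failure: resolve and handle it
    true
  elsif check[:type] == 'snapshot' && stored.nil?
    registry[key] = check        # patched branch: first OK snapshot is stored and handled
    true
  elsif check[:type] == 'metric'
    true                         # metric results are always handled
  else
    false                        # any other OK result is ignored
  end
end

registry = {}
ok = { name: 'test-filter-02', status: 0 }
puts handle_result?(registry, 'node-3', ok)                          # standard check, first OK
puts handle_result?(registry, 'node-3', ok.merge(type: 'snapshot'))  # snapshot check, first OK
```

The last branch is the one that explains the original symptom: a standard check that starts out OK never produces an event, so no filter or handler ever sees it.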

If anyone thinks this is helpful, I’ll be glad to send the patch.

On Tuesday, July 11, 2017 at 16:09:17 UTC+3, Yaron Yogev wrote:

···


Hi Yaron,

You should be able to get the same behavior without patching by setting the check type to “metric”. Each metric check result is sent to a handler, even when the status is “OK”.

On Wednesday, July 12, 2017 at 9:01:12 AM UTC-6, Yaron Yogev wrote:

···

I’m aware of the “metric” check type behavior. With “metric”, my handler will be called unnecessarily on every result after the first one, so I would need to filter out all calls after the first. I don’t see how I can do that (the occurrences counter only works when the status is not OK, right?).

On Wed, July 12, 2017 at 18:50, Cameron Johnston <cameron@sensu.io> wrote:

···


Hi Cameron,

Thanks for suggesting the use of the “metric” check type.

I tried the “metric” check type together with a filter, so that I only get the first occurrence and every state change.
However, I see that “occurrences” always equals 1 and “action” always equals “create”. This means that the “state_change_only” filter defined above will not work.

I thought I could use “history”, but my filter fails:

```
"history": "eval: value.length == 1 || value.last != value[-2]"
```

with this error message:

```
"error":"undefined method `length' for nil:NilClass"
```
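The error message suggests that `value` was nil when the expression ran. A likely explanation, offered here as an assumption rather than a confirmed diagnosis, is that “history” was placed at the top level of the filter attributes, while the history array lives under the event's check data. The Ruby sketch below illustrates the difference with a hypothetical, trimmed-down event; the helper `history_length` is made up for the illustration.

```ruby
# Hypothetical, trimmed-down event; real events carry many more fields.
event = {
  client: { name: 'node-3' },
  check: { name: 'test-filter-02', type: 'metric', status: 0, history: ['0'] },
  occurrences: 1,
  action: 'create'
}

# Returns the length, or the error text when the value does not respond to it.
def history_length(value)
  value.length
rescue NoMethodError => e
  "error: #{e.message}"
end

puts history_length(event[:history])          # no top-level :history key, so value is nil
puts history_length(event[:check][:history])  # the nested lookup finds the array
```

The first call reproduces the same kind of NoMethodError as in the log; the second returns the array length.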

I think I better understand your use case now so I can see what you mean. Try a filter like this:

```json
{
  "filters": {
    "state_change_only": {
      "negate": true,
      "attributes": {
        "check": {
          "history": "eval: value.last == value[-2]"
        }
      }
    }
  }
}
```

I have done some limited testing with this filter by sending metric type events to a local client socket. I believe this will work for your case because value[-2] returns nil when there is only one result in the history array.
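The reasoning about `value[-2]` can be checked in plain Ruby. The sketch below is not Sensu code: the method names `unchanged?` and `handled?` are hypothetical, and the history arrays are made-up samples using status strings.

```ruby
# True when the two most recent statuses are equal, i.e. nothing changed.
# For a one-element array, history[-2] is nil, so the comparison is false.
def unchanged?(history)
  history.last == history[-2]
end

# With "negate": true, events that match the filter are dropped,
# so a result is handled only when unchanged? is false.
def handled?(history)
  !unchanged?(history)
end

[['0'], ['0', '0'], ['0', '2'], ['0', '2', '2'], ['2', '2', '0']].each do |history|
  puts "#{history.inspect} -> #{handled?(history) ? 'handled' : 'filtered out'}"
end
```

A single-entry history and every change of status are handled, while repeated identical statuses are filtered out.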

Hope this helps!

Regards,

Cameron

On Thursday, July 13, 2017 at 7:00:55 AM UTC-6, Yaron Yogev wrote:

···