Unable to find Sensu Check on the Entity (VMs) | Check Deployment is Successful

The entities already have some base checks.
New checks show up in the Sensu UI but are not being applied to the entities.
Both entities are part of the same subscription, and it appears in the output of the following command for both:
# sensuctl entity info xxxxxxxx --namespace xxxxxxx --format yaml
I tried executing the check from the Sensu console (UI). That got it running on one of the entities, but not on the second one.
Recycling the VMs did not help.
The check still does not show under Events for one entity.

Can you share the check definition?

Is the check using any runtime assets?

And can you share the failing entity’s subscriptions?

Are all the entities in this scenario the same OS and arch?
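
For context on why the subscriptions matter: roughly speaking, an agent only receives a check request when it shares at least one subscription with the check. Below is a minimal Python sketch of that matching rule. The helper name `matching_subscriptions` and the sample values are hypothetical and only for illustration; the check's real subscriptions are not known at this point in the thread.

```python
def matching_subscriptions(check_subs, entity_subs):
    """Return the subscriptions shared by a check and an entity, sorted."""
    return sorted(set(check_subs) & set(entity_subs))


# Hypothetical values, for illustration only.
check_subs = ["dd.com/cm/JOE/ea"]
entity_subs = ["system/linux", "dd.com/cm/JOE/ea", "environment/prod"]

shared = matching_subscriptions(check_subs, entity_subs)
if shared:
    print("agent should receive the check via:", shared)
else:
    print("no shared subscription: the agent would not receive this check")
```

An empty result for the failing entity would point at a subscription mismatch; a non-empty one means the cause has to be looked for elsewhere.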

Please find the requested details below.

Can you share the check definition?

Yes, the details are below.

Is the check using any runtime assets?

Yes, details below:

runtime_assets:
- sensu-ruby-runtime:0.0.11
- sensu-plugins-cpu-checks:4.1.0
- sensu-plugins-disk-checks:5.1.4
- sensu-plugins-filesystem-checks:2.1.1
- sensu-plugins-memory-checks:4.1.1
- sensu-plugins-network-checks:5.0.0
- sensu-plugins-process-checks:4.1.0

And can you share the failing entity’s subscriptions?

subscriptions:
- datacenter/soc
- dd.com/cm/JOE
- environment/prod/linux
- system/linux/redhat/redhat8
- datacenter/soc/linux/redhat8
- system/linux/redhat
- cloud/vmc-2/linux/redhat8
- cloud/vmc-2/linux
- system/linux
- dd.com/cm/JOE/ea
- datacenter/soc/linux
- environment/prod
- cloud/vmc-2
- entity:wmJOEaprzsdl0.prd.vmc2.dd.com
- environment/prod/linux/redhat8

Are all the entities in this scenario the same OS and arch?

Yes, the details are below.

Details:

sensuctl check info app_JOE_cpu_EA_sa_user --namespace production --format yaml

type: CheckConfig
api_version: core/v2
metadata:
  annotations:
    fatigue_check/allow_resolution: "false"
    fatigue_check/interval: "910"
    fatigue_check/occurrences: "2"
    dd.com/cm/documentation:
    dd.com/cm/dr_mode: "false"
    dd.com/cm/repo_branch: WEALTH/JOE/emplauth/monitoring.yml?at=refs%2Fheads%2Fmaster
    dd.com/cm/support_action: A CPU alert has triggered for the 'APPSJOE01' user. Please please invesigate what is consuming the CPU resources.
  created_by: xxxxx
  labels:
    malcode: JOE
    stateless_event: "false"
  name: app_JOE_cpu_EA_sa_user
  namespace: production
spec:
  check_hooks: null
  command: check-cpu.rb --user APPSJOE01 -w 75 -c 80
  env_vars: null
  handlers:
  - base
  - cm_email
  high_flap_threshold: 0
  interval: 300
  low_flap_threshold: 0
  output_metric_format: ""
  output_metric_handlers: null
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets:
  - sensu-ruby-runtime:0.0.11
  - sensu-plugins-cpu-checks:4.1.0
  - sensu-plugins-disk-checks:5.1.4
  - sensu-plugins-filesystem-checks:2.1.1
  - sensu-plugins-memory-checks:4.1.1
  - sensu-plugins-network-checks:5.0.0
  - sensu-plugins-process-checks:4.1.0
  secrets: null
  stdin: false
  subdue: null
  subscriptions:
  timeout: 60
  ttl: 0

You didn’t actually supply the information about the entities.
You supplied the output of sensuctl check info,
but entity information is available through sensuctl entity info.

Here is the entity info for both: a good one, and the one with the problem (the second one, wmJOEaprzsdl0.prd.vmc2.dd.com).

Good entity, which is showing the new check:
sensuctl entity info wmJOEappzjbt0.prd.vmc2.dd.com --namespace production --format yaml
type: Entity
api_version: core/v2
metadata:
  annotations:
    salt_masters: ""
    salt_version: 2019.2.5
    Sensu | Page not found ‘{}’
    Sensu | Page not found ‘{}’
    dd.com/cm/provision_owner: self-serve-ops
  labels:
    cloud_provider: vmc-2
    cloudname: vmc-2-ddc
    datacenter: ddc
    environment: prod
    malcode: JOE
    pci_compliant: "false"
    sox_compliant: "true"
    dd.com/cm/enabled: "True"
  name: wmJOEappzjbt0.prd.vmc2.dd.com
  namespace: production
spec:
  deregister: false
  deregistration:
    handler: deregistration
  entity_class: agent
  last_seen: 1659645961
  redact:
  - password
  - passwd
  - pass
  - api_key
  - api_token
  - access_key
  - secret_key
  - private_key
  - secret
  sensu_agent_version: 6.2.5
  subscriptions:
  - dd.com/cm/JOE
  - environment/prod/linux
  - system/linux/redhat/redhat8
  - dd.com/cm/JOE/ea
  - system/linux/redhat
  - cloud/vmc-2/linux/redhat8
  - cloud/vmc-2/linux
  - system/linux
  - datacenter/bdc/linux
  - environment/prod
  - cloud/vmc-2
  - datacenter/bdc/linux/redhat8
  - datacenter/bdc
  - entity:wmJOEappzjbt0.prd.vmc2.dd.com
  - environment/prod/linux/redhat8
  system:
    arch: amd64
    cloud_provider: ""
    hostname: wmJOEappzjbt0.prd.vmc2.dd.com
    libc_type: glibc
    network:
      interfaces:
      - addresses:
        - 10.51.201.93/21
        mac: 00:50:56:9f:ee:a0
        name: ens192
      - addresses:
        - 127.0.0.1/8
        name: lo
    os: linux
    platform: redhat
    platform_family: rhel
    platform_version: "8.5"
    processes: null
    vm_role: ""
    vm_system: ""
  user: agent

----------- Problem entity, where the check is not showing -----------
sensuctl entity info wmJOEaprzsdl0.prd.vmc2.dd.com --namespace production --format yaml
type: Entity
api_version: core/v2
metadata:
  annotations:
    salt_masters: ""
    salt_version: 2019.2.5
    Sensu | Page not found ‘{}’
    Sensu | Page not found ‘{}’
    dd.com/cm/provision_owner: self-serve-ops
  labels:
    cloud_provider: vmc-2
    cloudname: vmc-2-soc
    datacenter: soc
    environment: prod
    malcode: JOE
    pci_compliant: "false"
    sox_compliant: "true"
    dd.com/cm/enabled: "True"
  name: wmJOEaprzsdl0.prd.vmc2.dd.com
  namespace: production
spec:
  deregister: false
  deregistration:
    handler: deregistration
  entity_class: agent
  last_seen: 1659646298
  redact:
  - password
  - passwd
  - pass
  - api_key
  - api_token
  - access_key
  - secret_key
  - private_key
  - secret
  sensu_agent_version: 6.2.5
  subscriptions:
  - datacenter/soc
  - dd.com/cm/JOE
  - environment/prod/linux
  - system/linux/redhat/redhat8
  - datacenter/soc/linux/redhat8
  - system/linux/redhat
  - cloud/vmc-2/linux/redhat8
  - cloud/vmc-2/linux
  - system/linux
  - dd.com/cm/JOE/ea
  - datacenter/soc/linux
  - environment/prod
  - cloud/vmc-2
  - entity:wmJOEaprzsdl0.prd.vmc2.dd.com
  - environment/prod/linux/redhat8
  system:
    arch: amd64
    cloud_provider: ""
    hostname: wmJOEaprzsdl0.prd.vmc2.dd.com
    libc_type: glibc
    network:
      interfaces:
      - addresses:
        - 10.45.200.247/21
        mac: 00:50:56:96:a0:ba
        name: ens192
      - addresses:
        - 127.0.0.1/8
        name: lo
    os: linux
    platform: redhat
    platform_family: rhel
    platform_version: "8.5"
    processes: null
    vm_role: ""
    vm_system: ""
  user: agent

Okay, the two entities are running the same OS version.
I’m not seeing an obvious problem in the configuration.
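
One way to double-check that by machine rather than by eye is to diff the two subscription lists. The following is a minimal Python sketch using the subscriptions pasted in the two `sensuctl entity info` outputs; the variable and function names are my own and not part of any Sensu tooling.

```python
# Subscriptions copied from the two entity dumps in this thread.
good_subs = {
    "dd.com/cm/JOE", "environment/prod/linux", "system/linux/redhat/redhat8",
    "dd.com/cm/JOE/ea", "system/linux/redhat", "cloud/vmc-2/linux/redhat8",
    "cloud/vmc-2/linux", "system/linux", "datacenter/bdc/linux",
    "environment/prod", "cloud/vmc-2", "datacenter/bdc/linux/redhat8",
    "datacenter/bdc", "entity:wmJOEappzjbt0.prd.vmc2.dd.com",
    "environment/prod/linux/redhat8",
}
problem_subs = {
    "datacenter/soc", "dd.com/cm/JOE", "environment/prod/linux",
    "system/linux/redhat/redhat8", "datacenter/soc/linux/redhat8",
    "system/linux/redhat", "cloud/vmc-2/linux/redhat8", "cloud/vmc-2/linux",
    "system/linux", "dd.com/cm/JOE/ea", "datacenter/soc/linux",
    "environment/prod", "cloud/vmc-2", "entity:wmJOEaprzsdl0.prd.vmc2.dd.com",
    "environment/prod/linux/redhat8",
}


def compare(a, b):
    """Return (shared, only_in_a, only_in_b) for two subscription sets."""
    return a & b, a - b, b - a


shared, only_good, only_problem = compare(good_subs, problem_subs)
print("shared:", len(shared))
print("only on good entity:", sorted(only_good))
print("only on problem entity:", sorted(only_problem))
```

The only differences are the datacenter-specific subscriptions (bdc vs. soc) and each entity’s own `entity:` subscription, so any check subscription that is not datacenter-specific matches both entities equally.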

I will say that you have unneeded runtime assets defined in the check.
For that check command you only need the ruby runtime asset and the
sensu-plugins-cpu-checks, but that’s probably not the problem.

Can you report back the output of
sensuctl event list --format tabular |grep app_JOE_cpu_EA_sa_user

sudo sensuctl event list --format tabular --namespace production |grep -i app_JOE_cpu_EA_sa_user
wmJOEappzjbt0.prd.vmc2.td.com app_JOE_cpu_EA_sa_user CheckCPU USER OK: total=2.6 user=1.3 nice=0.0 system=1.1 idle =97.4 iowait=0.0 irq=0.1 softirq=0.1 steal=0.0 guest=0.0 guest_nice=0.0 0 false 2022-08-04 09:55:35 -0400 EDT fc8bd0e0-6aec-45e6-9231-3dfd04b5510a
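
The grep output above only lists the good entity. With more hosts it can be easier to work from `sensuctl event list --format json` and compute which expected entities have no event for the check. Below is a minimal Python sketch of that idea; the sample data is hand-written and heavily trimmed (real events carry many more fields), and the function names are hypothetical.

```python
import json


def entities_with_check(events, check_name):
    """Return the names of entities that have an event for check_name."""
    names = set()
    for event in events:
        check = event.get("check") or {}
        if check.get("metadata", {}).get("name") == check_name:
            names.add(event["entity"]["metadata"]["name"])
    return names


def missing_entities(events, check_name, expected):
    """Return expected entities that have no event for check_name, sorted."""
    return sorted(set(expected) - entities_with_check(events, check_name))


# Hand-written stand-in for the JSON array printed by the event list command.
sample = json.loads("""
[
  {"entity": {"metadata": {"name": "wmJOEappzjbt0.prd.vmc2.dd.com"}},
   "check":  {"metadata": {"name": "app_JOE_cpu_EA_sa_user"}}},
  {"entity": {"metadata": {"name": "wmJOEaprzsdl0.prd.vmc2.dd.com"}},
   "check":  {"metadata": {"name": "keepalive"}}}
]
""")

expected = ["wmJOEappzjbt0.prd.vmc2.dd.com", "wmJOEaprzsdl0.prd.vmc2.dd.com"]
missing = missing_entities(sample, "app_JOE_cpu_EA_sa_user", expected)
print("entities without the check event:", missing)
```

In practice the `sample` value would be replaced by the parsed output of the real command for the `production` namespace.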

Okay, so far I’m not seeing a misconfiguration that would explain this.

One more thing: does the entity with the missing check event have a valid keepalive event showing that the entity has been seen recently?

Is that entity successfully running any other check and producing events?
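
As a quick sanity check on "seen recently", the `last_seen` values in the entity dumps are Unix timestamps and can be converted to readable times. This is a minimal Python sketch; the 120-second staleness threshold is an illustrative assumption, not a value taken from this deployment’s configuration.

```python
from datetime import datetime, timezone


def last_seen_utc(epoch_seconds):
    """Convert a last_seen Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def is_stale(epoch_seconds, now_epoch, threshold_seconds=120):
    """True if the entity was last seen more than threshold_seconds ago.

    threshold_seconds is an assumed value for illustration only.
    """
    return (now_epoch - epoch_seconds) > threshold_seconds


# last_seen values copied from the two entity dumps in this thread.
last_seen = {
    "wmJOEappzjbt0.prd.vmc2.dd.com": 1659645961,
    "wmJOEaprzsdl0.prd.vmc2.dd.com": 1659646298,
}

for name, ts in last_seen.items():
    print(name, last_seen_utc(ts).isoformat())
```

Both timestamps fall on the same day as the event shown earlier, which is consistent with both agents checking in; `is_stale` would be called with the current time at the moment the dump is taken.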

Thank you so much for taking a look at this issue. It has been resolved.
Our L2 team found performance degradation on the backend; recycling the backend fixed it.
Yes, the keepalive has been running all along, together with the other checks. The only issue was with the new checks.