High CPU utilization issue after creating a large number of checks (25+)

Hello Everyone,

I’m using Sensu to monitor my server, and I’ve set up over 25 checks. Because many checks run at the same time, CPU utilization is extremely high.

Has anyone else encountered this problem? Is there a way to control how many processes are created at the same time?

Operating System: Windows
Sensu Go: 6.8

Hey there,

What are the checks doing? What plugins or assets are you using? There are any number of things that can affect CPU utilization, so it’s sort of hard to say what’s happening without knowing more about your environment and the checks you’re running.

Hey @aaronsachs,

Thanks for the quick reply.

What are the checks doing? What plugins or assets are you using?

I have a few checks that share the same custom asset (implemented with the Python Prometheus client parser). The asset scrapes metrics from Windows Exporter on that machine, and each check monitors specific metric values (for example, CPU or Windows services).

I know that scraping and monitoring metrics is a resource-intensive task. Is there a way to limit the number of processes created?

Hey there,

Is there a reason to use the python client parser? You could use a native powershell command like:

Invoke-WebRequest -UseBasicParsing -Uri http://localhost:8080/metrics | Select-Object -ExpandProperty Content  

That would gather the metrics. There’s also HTTP-GET from the http check collection, which, since it’s compiled, may not be as heavy-handed as using Python.

In either case, the agent can gather those metrics and alert on specific ones using metric threshold evaluation, which may reduce the number of checks you need to implement. It may also be worth checking whether a particular metric is taking longer than you’d expect to return a value: if a check is choking on something, executions could pile up as a result. I’d be really curious whether the logs show anything about check executions still being in progress.
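
As a rough way to see whether the scrape itself is slow, a small Python sketch like the one below could time a single request to the exporter. The URL is the one from the PowerShell example above and may differ in your setup; the function names `fetch_metrics` and `timed` and the 10-second timeout are made up for illustration.

```python
import time
import urllib.request

# Assumption: same endpoint as in the PowerShell example; adjust to your exporter.
METRICS_URL = "http://localhost:8080/metrics"


def fetch_metrics(url=METRICS_URL, timeout=10.0):
    """Fetch the raw metrics text, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def timed(fetch):
    """Call `fetch()` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fetch()
    return result, time.perf_counter() - start


# Example usage (needs a running exporter):
#   body, elapsed = timed(fetch_metrics)
#   print(f"scraped {len(body.splitlines())} lines in {elapsed:.2f}s")
```

If the elapsed time comes close to a check's interval, overlapping executions become more likely.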

Hi @aaronsachs,

Is there a reason to use the python client parser?

Yes. Some of my checks monitor the total network packets received across all network interface cards, so I have to sum the values for every card, which means parsing the metrics.
That is just one of my requirements; we also need to check whether a few services are running, paused, or stopped.

# HELP windows_net_packets_received_total (Network.PacketsReceivedPerSec)
# TYPE windows_net_packets_received_total counter
windows_net_packets_received_total{nic="Intel_R__Wi_Fi_6_AX205_160MHz"} 934708
windows_net_packets_received_total{nic="Realtek_PCIe_GBX_Family_Controller"} 0
# HELP windows_service_state The state of the service (State)
# TYPE windows_service_state gauge
windows_service_state{name="mssqlserver",state="continue_pending"} 0
windows_service_state{name="mssqlserver",state="pause_pending"} 0
windows_service_state{name="mssqlserver",state="paused"} 0
windows_service_state{name="mssqlserver",state="running"} 1
windows_service_state{name="mssqlserver",state="start_pending"} 0
windows_service_state{name="mssqlserver",state="stop_pending"} 0
windows_service_state{name="mssqlserver",state="stopped"} 0
windows_service_state{name="mssqlserver",state="unknown"} 0

As far as I know, we can’t handle this using a check.
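
For illustration, here is a minimal Python sketch of how one process could work on a single copy of the scraped text and answer both questions (the packet total across NICs and the current service state), instead of one process per metric. It uses a simplified standard-library parser so it runs on its own; it is not the custom asset described above. The names `parse_samples`, `total_packets_received`, and `service_state` are hypothetical. The real `prometheus_client.parser.text_string_to_metric_families` function could replace the hand-written parsing.

```python
import re

# Simplified matcher for Prometheus text-format sample lines (an assumption:
# it does not cover every corner of the exposition format).
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Sample lines copied from the exporter output shown above.
SAMPLE = """\
windows_net_packets_received_total{nic="Intel_R__Wi_Fi_6_AX205_160MHz"} 934708
windows_net_packets_received_total{nic="Realtek_PCIe_GBX_Family_Controller"} 0
windows_service_state{name="mssqlserver",state="paused"} 0
windows_service_state{name="mssqlserver",state="running"} 1
windows_service_state{name="mssqlserver",state="stopped"} 0
"""


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total_packets_received(samples):
    """Sum windows_net_packets_received_total over every NIC."""
    return sum(
        value for name, _, value in samples
        if name == "windows_net_packets_received_total"
    )


def service_state(samples, service):
    """Return the state label whose gauge is 1 for `service`, or None."""
    for name, labels, value in samples:
        if (name == "windows_service_state"
                and labels.get("name") == service and value == 1):
            return labels.get("state")
    return None


samples = parse_samples(SAMPLE)
print(total_packets_received(samples))
print(service_state(samples, "mssqlserver"))
```

Whether folding several evaluations into one script actually lowers CPU usage here is untested; it only reduces how many interpreter processes have to start.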
