Websocket Connection Error 1006 (abnormal closure)

On my Kubernetes DaemonSet agents, as well as on the agents installed directly on VMs, I very often see the following error in the logs. I don’t see any issues on the backend for the related entities. Any idea what could be wrong?

Mai 29 10:28:33 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:28:33+02:00"}
Mai 29 10:29:03 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:29:03+02:00"}
Mai 29 10:29:33 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:29:33+02:00"}
Mai 29 10:30:03 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:30:03+02:00"}
Mai 29 10:30:33 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:30:33+02:00"}
Mai 29 10:31:03 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:31:03+02:00"}
Mai 29 10:31:33 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:31:33+02:00"}
Mai 29 10:32:03 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:32:03+02:00"}
Mai 29 10:32:34 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:32:34+02:00"}
Mai 29 10:33:04 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:33:04+02:00"}

That’s interesting, it looks like the connection is being severed almost exactly every 30 seconds. These connections are normally very long-lived. I don’t know why this would be happening in your environment.
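One way to confirm the cadence is to compute the gaps between the `time` fields of the error entries. The following is only a minimal sketch: it assumes each log line ends with the agent's JSON payload, as in the excerpt above, and the helper name `error_intervals` is made up for illustration.

```python
import json
from datetime import datetime

# Three consecutive lines copied from the log excerpt in the original post.
LOG_LINES = [
    'Mai 29 10:31:33 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:31:33+02:00"}',
    'Mai 29 10:32:03 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:32:03+02:00"}',
    'Mai 29 10:32:34 ceph-admin.de sensu-agent[44663]: {"component":"agent","error":"Connection error: websocket: close 1006 (abnormal closure): unexpected EOF","level":"error","msg":"transport receive error","time":"2020-05-29T10:32:34+02:00"}',
]


def error_intervals(lines):
    """Return the seconds between consecutive 'transport receive error' entries."""
    times = []
    for line in lines:
        start = line.find("{")  # the JSON payload follows the journald prefix
        if start == -1:
            continue
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # skip lines whose payload is not valid JSON
        if entry.get("msg") != "transport receive error":
            continue
        times.append(datetime.fromisoformat(entry["time"]))
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


print(error_intervals(LOG_LINES))  # [30.0, 31.0]
```

A near-constant gap like this points to a fixed timeout somewhere between agent and backend rather than to random network trouble.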


Hmm, I wonder how the ingress is configured in your cluster. From working with nginx in a more traditional web application setup, I know its config has tunables meant for ordinary web traffic, such as the keepalive connection settings, that can play havoc with websockets and cause them to close.

I’d strongly suspect that something in your ingress controller is suboptimal for websockets. Sensu uses a long-lived websocket between agent and backend, so if you have a tunable like nginx’s keepalive_timeout configured (that’s a guess on my part), your ingress controller could be killing connections on a regular cycle like the one you are describing.

@eric you were right.
I am currently using a Google Load Balancer. Its backend service timeout defaults to 30 seconds, and that timeout also applies to websocket connections.
https://cloud.google.com/load-balancing/docs/https#websocket_proxy_support

Increasing that to 3600s will make the issue appear once every hour. Do you have a recommendation for what I should set it to? Would the connection otherwise ever be closed and re-opened?

Agents will attempt to reconnect when their connection is severed, so you can probably set this number to whatever you like without issue. It shouldn’t have too much of an effect on check execution.
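To put the timeout choice in numbers, here is a small sketch of how many forced reconnects per agent a given timeout implies. It assumes the load balancer cuts every connection once the timeout elapses, as observed in this thread, and that the agent reconnects right away; the function name `reconnects_per_day` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reconnects_per_day(timeout_seconds):
    """Forced reconnects per agent per day if the connection is cut every timeout_seconds."""
    return SECONDS_PER_DAY // timeout_seconds


# 30 is the default mentioned above, 3600 the value proposed in the question;
# 86400 (one day) is an extra illustrative value.
for timeout in (30, 3600, 86400):
    print(f"{timeout:>6}s timeout -> {reconnects_per_day(timeout)} reconnects per day")
```

With the 30s default that works out to 2880 reconnects per agent per day, against 24 at 3600s.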