When a large number of requests are time-consuming, they affect each other, causing 499
funky-eyes opened this issue · comments
When some requests have very high response times (rt), other, normal requests stop responding normally, resulting in 499s. Once the application responsible for the high rt fixes the problem, the 499s disappear.
I don't know if it has anything to do with the version of OpenResty I use.
openresty/1.13.6.2
centos:7.9.2009
kernel: 3.10.0-1160.81.1.el7.x86_64
ulimit openfiles: 1000000
HTTP 499 in Nginx means that the client closed the connection before the server answered the request. In my experience it is usually caused by a client-side timeout. So it seems your upstream server does not respond before the HTTP client's timeout expires.
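To confirm that the 499s come from a client-side timeout rather than from nginx itself, it can help to log per-request timing. A minimal sketch using standard nginx variables (the log format name and file path here are arbitrary):

```nginx
# $request_time           -- total time nginx spent on the request
# $upstream_response_time -- time spent waiting on the upstream
log_format timing '$remote_addr "$request" $status '
                  'req_time=$request_time upstream_time=$upstream_response_time';

access_log /var/log/nginx/access_timing.log timing;
```

If the 499 entries show a large `$request_time` while `$upstream_response_time` is empty or still growing, the client gave up waiting before the upstream ever answered.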
a request has a large number of high rt
What exactly does the high rt mean?
Does it mean the request will block the nginx cycle and nginx cannot serve other requests?
Or does it mean that the upstream of the high rt request responds slowly?
If it is the latter, then it is abnormal.
If it is the former, you can try to yield in the high rt request, or avoid using blocking APIs.
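To illustrate the difference between a blocking call and a yielding one: a hedged sketch of OpenResty Lua (meant to run inside a handler such as `content_by_lua_block`; the address and port are placeholders), using the real `ngx.sleep` and cosocket APIs:

```lua
-- BAD: os.execute blocks the entire nginx worker process for 2 seconds;
-- every other request handled by that worker stalls with it.
-- os.execute("sleep 2")

-- GOOD: ngx.sleep yields the current request's coroutine back to the
-- nginx event loop, so other requests keep being served meanwhile.
ngx.sleep(2)

-- Likewise, prefer the non-blocking cosocket API over blocking sockets:
local sock = ngx.socket.tcp()   -- yields on network I/O instead of blocking
sock:settimeout(1000)           -- 1s timeout, in milliseconds
local ok, err = sock:connect("127.0.0.1", 8080)
if not ok then
    ngx.log(ngx.ERR, "connect failed: ", err)
end
```

If any request handler uses a blocking API like this, one slow request can delay every other request multiplexed on the same worker, which matches the symptom described in this issue.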
I have multiple upstream services. For example, when the QPS of my service A reaches 230 with an rt of 30 ms, service B is affected by service A, resulting in 499s and rt jitter. When I migrated service A to another OpenResty cluster, the problem was solved immediately. The monitoring charts from that time show nothing abnormal in CPU, IO, or load; the OpenResty cluster is very healthy.
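One common way for a slow service to drag down its neighbors on the same proxy is shared connection handling without per-service limits. A hedged sketch of isolating two services inside one OpenResty instance (the upstream addresses, paths, and timeout values below are hypothetical, not taken from this issue) is to give each service its own `upstream` block, its own keepalive pool, and tight per-location timeouts so a slow service A fails fast instead of tying up resources:

```nginx
upstream service_a {
    server 10.0.0.10:8080;   # hypothetical backend for service A
    keepalive 32;            # A's own keepalive connection pool
}

upstream service_b {
    server 10.0.0.20:8080;   # hypothetical backend for service B
    keepalive 32;            # B's pool is independent of A's
}

server {
    listen 80;

    location /a/ {
        proxy_pass http://service_a;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        # Fail fast on the slow service so requests do not pile up.
        proxy_connect_timeout 1s;
        proxy_read_timeout    5s;
    }

    location /b/ {
        proxy_pass http://service_b;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_connect_timeout 1s;
        proxy_read_timeout    5s;
    }
}
```

This does not explain why the interference happens, but it bounds the blast radius: even if A's rt spikes, requests to A are cut off at 5 s rather than accumulating until clients start aborting with 499.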
You can use OpenResty XRay to analyze this issue.
Please go to https://xray.openresty.com
I will try to use it to analyze the issue, thanks for the reply.