benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.

Home Page: http://www.gunicorn.org

404 errors during --max-requests restarts

mikeckennedy opened this issue

Hi all. Thanks for the awesome project. I've started using it for https://talkpython.fm and it's been solid.

However, I recently started trying to control memory usage more carefully on the server, and part of that is setting max_requests and max_requests_jitter to 1,000 and 200 respectively.
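For context, those two settings would look roughly like this in a gunicorn config file. This is a minimal sketch: the file name `gunicorn.conf.py` is an assumption, and only the two values come from this report, not the rest of the actual talkpython.fm configuration.

```python
# gunicorn.conf.py (hypothetical file; only the two values below come from this report)

# Restart a worker after it has handled roughly this many requests.
max_requests = 1000

# Add a random per-worker offset of up to this many requests,
# so that workers do not all restart at the same moment.
max_requests_jitter = 200
```

The same values can be passed on the command line as `--max-requests 1000 --max-requests-jitter 200`.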

To speed up the experimenting, I'm using the excellent https://locust.io, which hammers the server at a controlled rate. I hit it with around 7,500 requests, and the site worked fine, as expected, when there are no restarts.

However, when adding those restart features, I started getting very rare 404s while gunicorn was restarting the workers.

My expectation was that it would go:

  1. Exceed request number
  2. Launch another worker process
  3. Wait for process ready
  4. Start redirecting new requests away from vanishing worker and towards the new workers
  5. Wait for tired worker to finish or time out requests
  6. Shut down tired worker entirely

But it seems there is some intermediate time while the old worker is vanishing and the new one is not ready. Is that possible?

Again, running without max_requests we have zero 404s. With it, we get a couple every time the restart triggers.

In the picture below, you can see the errors appear right when the restart happens (visible because of the momentary, and reasonable, increase in latency).

[Image: errors]

It also seems that the jitter is ignored. The requests basically go to zero, even though there should be a solid amount of jitter between the workers.
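To make the jitter expectation concrete, here is a small sketch of how per-worker limits would spread out. It assumes each worker independently adds a random offset between 0 and max_requests_jitter to max_requests when it starts; that is my understanding of the setting, not a copy of gunicorn's code. The function names, the worker count of 4, and the seed are made up for illustration.

```python
import random


def worker_limits(max_requests, jitter, workers, seed=None):
    """Return one request limit per worker.

    Assumption: each worker picks its own offset uniformly in [0, jitter].
    """
    rng = random.Random(seed)
    return [max_requests + rng.randint(0, jitter) for _ in range(workers)]


def restart_window(limits):
    """Requests per worker between the first and the last worker hitting its limit."""
    return max(limits) - min(limits)


# Hypothetical setup: 4 workers with the values used in this report.
limits = worker_limits(1000, 200, workers=4, seed=0)
print(limits, restart_window(limits))
```

With only a few workers and evenly spread load, the window can never exceed 200 requests per worker, so under a fast load test all workers may still recycle within a short period even when the jitter is applied.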

Are you certain about the status code? A 50X is to be expected in some configurations, not a 404. If you have a 404 HTTP response, show its contents and headers.

Please share your config (which worker are you using?) to clarify how closely your problem is related to known issues such as #3038.

Thank you @pajod. Yes, the 404s are confusing. There is something weird with my dockerized nginx <-> dockerized app on gunicorn that makes it report a 404 when the backend is down. They really should be 502s, but this is something new that just started happening (404 swapped for 502) and I haven't had time to address it. More germane here is that it doesn't matter whether it's a 404 or a 502; what matters is that requests fail whenever max-requests is set.

This seems to be a duplicate of the open issue #3038 so please put your attention there and I'll close this as a duplicate. It does seem to be exactly the same issue.

And FWIW, I believe the reason I need restarting at all is a memory leak in a third-party library, which is super frustrating but also really hard to control. My research there continues. If I'm lucky, I won't need this feature at all, but gunicorn reliably producing failed requests when the feature is used correctly is worth reporting. :)