pgjones / hypercorn

Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.

FastAPI deployed with hypercorn in GCP Cloud Run returning 503 sporadically

bgregoinductiva opened this issue

I have a FastAPI project deployed in Cloud Run using the Hypercorn server. I'm using uvloop as the event loop and leaving all other settings at their default values:

hypercorn app.main:app --bind 0.0.0.0:80 --worker-class uvloop

Here are the Cloud Run configurations:

  • Memory: 1 GiB
  • CPU: 1
  • Maximum concurrent requests per instance: 80
  • CPU is only allocated during request processing
  • Minimum number of instances: 1
  • Maximum number of instances: 30
  • Startup CPU boost
  • Use HTTP/2 end-to-end

When integration testing produces a peak of about 30 concurrent requests, I usually get a 503, and then a new instance is started.

Has anyone faced a similar problem before?

Thanks in advance.
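
For anyone trying to reproduce the burst described above, here is a minimal sketch that fires a batch of concurrent requests and tallies the status codes. It uses only the Python standard library, so the client side speaks HTTP/1.1. The URL in the usage section is a placeholder, and the helper names `fetch_status` and `burst` are made up for illustration, not part of Hypercorn or Cloud Run.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0):
    """Return the HTTP status code for one GET, or "error" if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises for 4xx/5xx responses; the status code is still available.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, timeout, and similar transport failures.
        return "error"


def burst(url: str, n: int = 30) -> Counter:
    """Send n requests at roughly the same time and count the outcomes."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return Counter(pool.map(fetch_status, [url] * n))


if __name__ == "__main__":
    # Placeholder URL (assumption): replace with your own Cloud Run service URL.
    print(burst("https://your-service.example.com/health", n=30))
```

A non-zero count under the `503` key would match the behaviour described in this issue.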

Yes. Based on what I have learnt so far, your instance was terminated because it used more memory than its defined limit.

Even though this says that Cloud Run will return a 500, in my testing I was able to confirm that it actually returns a 503. Their documentation leaves a lot to be desired.

Hope this helps.

We have the same issue, but at only 40% memory usage at the 99th percentile.

Update: we isolated the issue to only HTTP/2. HTTP/1 seems to be fine.

Is the HTTP/1 traffic encrypted? There seems to be an asyncio memory leak with SSL.

Cloud Run terminates TLS.
https://cloud.google.com/run/docs/container-contract#tls

Also, I hate to admit this in public, but I wasn't closing SQL connections in the health check endpoint, so it was leaking file descriptors. This caused our Cloud Run containers to crash without any log events, and the Cloud Run LB returned a 503.

So another thing to check would be your file descriptor count.
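
A rough sketch of how to check this from inside the process, assuming a Linux container where `/proc/self/fd` is readable (with `/dev/fd` as a fallback). The thread does not say which database driver was involved, so `sqlite3` stands in for the real one here, and the names `count_open_fds`, `fd_soft_limit`, and `health_check` are hypothetical. The idea is to close the connection explicitly on every health check and confirm that the descriptor count stays flat across many calls.

```python
import os
import resource
import sqlite3
import tempfile
from contextlib import closing


def count_open_fds() -> int:
    """Count this process's open file descriptors.

    The listing includes the descriptor used to read the directory itself,
    so the number can be one higher than the steady-state count.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            return len(os.listdir(fd_dir))
        except OSError:
            continue
    raise RuntimeError("no file descriptor directory available on this platform")


def fd_soft_limit() -> int:
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def health_check(db_path: str) -> dict:
    """Stand-in health check (sqlite3 is an assumption, not the original driver)."""
    # closing() guarantees conn.close() runs; sqlite3's own context manager
    # only commits or rolls back, it does not close the connection.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("SELECT 1").fetchone()
    return {"status": "ok"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "health.db")
        before = count_open_fds()
        for _ in range(100):
            health_check(db_path)
        after = count_open_fds()
        print(f"open fds before={before} after={after} soft limit={fd_soft_limit()}")
```

If the count after the loop keeps climbing towards the soft limit, something in the request path is not releasing its descriptors.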