python-trio / hip

A new Python HTTP client for everybody

review connection lifecycle

njsmith opened this issue

The intended lifecycle of a urllib3 connection involves being set up, put into a connection pool, taken out of the connection pool, used, put back in the connection pool, etc. Some of these steps are tricky:

  • if something goes wrong when setting up the connection (e.g., can't get the socket to connect, TLS handshake error) then it should never be put into the pool
  • when taking it out of the pool, it might have been closed by the server, so we have to check if it's still usable. (Generally, "if the socket is readable, it's not usable" is the heuristic we want to use, since it indicates that the server has sent some sort of 'go away' message or just closed the socket; making this work with Twisted is non-trivial.)
  • we should only put it back into the pool if it's in a clean and ready to re-use state. (If our caller only reads half the response and then stops, we can't re-use that connection.)
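The "readable means not usable" heuristic from the second bullet can be sketched with the standard library alone. This is a minimal illustration, not urllib3's actual code: the name `looks_reusable` is hypothetical, and it assumes a plain blocking socket that is idle (no response is expected on it), so any readability means either EOF or unsolicited data from the server.

```python
import select
import socket


def looks_reusable(sock: socket.socket) -> bool:
    """Hypothetical helper: guess whether an idle pooled socket is still usable.

    An idle HTTP connection should have nothing to read. If select() reports
    the socket as readable, the peer has either closed it (EOF) or sent
    unexpected bytes, so the connection should be discarded.
    """
    # A timeout of 0 makes this a non-blocking poll.
    readable, _, _ = select.select([sock], [], [], 0)
    return not readable
```

A pool would call something like this when taking a connection out, and open a fresh connection whenever it returns `False`. A `True` result only means no close had been observed at the moment of the check.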

Well, if the socket isn't readable when we take the connection out of the pool, it's probably usable, but that's not a guarantee. I once had a cute race condition where the server timed out idle connections after 60 seconds … and the client sent a request every minute … so the new request and the server's FIN crossed each other.

In other words, "send a request and immediately get EOF" is a hard error on a new connection, but on a connection taken from the pool the request must be retried.

I don't think there's any general solution for that race condition, but we can certainly keep whatever strategy urllib3 is currently using.

(In general, when you get a blank EOF after sending a request, you don't know whether the server processed it or not, and if the request isn't idempotent, automatically retrying it could be dangerous.)
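One way to combine the points made in this thread is a small decision function. This is only a sketch of a conservative policy, not whatever strategy urllib3 currently uses: the function name, its parameters, and the choice to never retry non-idempotent methods are all assumptions made for illustration. The method list follows the HTTP definition of idempotent methods.

```python
# Methods that HTTP defines as idempotent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def should_retry_after_eof(method: str, from_pool: bool, bytes_received: int) -> bool:
    """Hypothetical policy: retry after EOF only when it is likely to be safe."""
    if bytes_received > 0:
        # A partial response arrived, so this is not the "blank EOF" case.
        return False
    if not from_pool:
        # On a brand-new connection an immediate EOF is treated as a hard error.
        return False
    # On a pooled connection the server may have closed it while idle, but we
    # cannot know whether the request was processed, so only idempotent
    # methods are retried automatically.
    return method.upper() in IDEMPOTENT_METHODS
```

For example, `should_retry_after_eof("GET", from_pool=True, bytes_received=0)` allows a retry, while the same call with `"POST"` does not, leaving that decision to the caller.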