socketry / async-http

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Hangs when closing an async http client

BMorearty opened this issue · comments

I don't know if I'm doing this wrong, but this code hangs consistently when calling Async::HTTP::Client#close:

require 'async'
require 'async/http'

site = 'https://github.com'
path = '/about'

Async do
  uri = URI.parse(site)
  endpoint = Async::HTTP::Endpoint.new(uri)
  client = Async::HTTP::Client.new(endpoint)

  response = client.get(path)
  body = response.body.read.split("\n")
  puts "Response sample: #{body.grep(/doctype/i)[0]}\nResponse status: #{response.status}"

  # This hangs
  client.close

  puts "Finished closing."
end

Sample run:

Screen Shot 2019-12-01 at 2 18 08 PM

If I comment out client.close, the program finishes.

But! If I tried a couple of websites. I randomly tried changing github.com to airbnb.com and changed /about to /how-it-works, and the behavior was different:

  • close still hung, as with GitHub.
  • But if I remove the close call, it prints Finished closing and still hangs, never finishing.

image

Here are the changes to make this happen:

site = 'https://www.airbnb.com'
path = '/how-it-works'

...

# client.close

Use response.read instead of response.body.read (which only returns one chunk) or call response.close (May terminate underlying connection) or response.finish (will read entire response).

Ok, switching from response.body.read to response.read fixes it. Thanks.

But if I forget the client.close it still hangs, which surprises me. I know one should always close things that are opened but I expected a memory leak instead of a hang.

Yeah the way it currently works could be different. It hangs because there is an HTTP/2 background reader

If it’s possible to make it not hang in this scenario, I think it would be helpful. I could see this causing a lot of confusion over the long term.

It hangs but it also prints out a log message (waiting for pool to drain).

It's definitely not the same between HTTP/1 (doesn't require background reader) and HTTP/2 (requires background reader). That bothers me slightly.

But it's also tricky. HTTP/2 semantics make it more difficult to use... I've seen people trying to do the following:

Async do |task|
    client = connect
    task.reactor.stop
end

# some other code

Async do |task|
    client.do_some_request
    # etc
end

The problem here is that Reactor#stop doesn't stop all child tasks. I'm not sure that was the right decision on my part. I may rename that method #pause and use Reactor#stop as the same semantics as Task#stop.

The semantics of this hypothetical #pause are useful but tricky. If you want to embed Async::Reactor into another run-loop you need the ability to run it for a short duration, e.g. 10ms or something like that. I'm on the fence as to whether this should be an "allowed" use case, but here is where it gets tricky:

If you have a reactor, and you make, say, HTTP/2 connection, and if you don't service that connection regularly, stuff like ping/pong, send/receive windows, etc might not be updated frequently enough (e.g. more than 1 second). This may cause the remote end to drop the connection or just cause weird issues/latency.

A more basic example would be someone who schedules a timer to run ever 1 second, but only updates the run-loop every 1.5 seconds. Of course it will not work correctly.

So it comes back to the following:

Async do
    # What resources can outlive this block?
end

Right now, tasks (e.g. background readers) can't escape this unless you explicitly call Reactor#stop (which should probably be renamed Reactor#pause). Sockets and other I/O can escape (e.g. connected sockets) but this can cause unexpected behaviour which is why the debug specs flag this as an issue.

What could be good, as a first step, is to ensure we validate and report on these issues very clearly during specs, e.g. async-http could give more thorough warnings:

Waiting for pool to drain. There are 3 outstanding connections that have not been closed.
- Connection 1 POST http://foo/bar started at file.rb:32
- ... etc

I'm a strong advocate for strict, correct and verbose tests.