Hangs when closing an async http client

Question

BMorearty opened this issue 5 years ago

I don't know if I'm doing this wrong, but this code hangs consistently when calling Async::HTTP::Client#close:

require 'async'
require 'async/http'

site = 'https://github.com'
path = '/about'

Async do
  uri = URI.parse(site)
  endpoint = Async::HTTP::Endpoint.new(uri)
  client = Async::HTTP::Client.new(endpoint)

  response = client.get(path)
  body = response.body.read.split("\n")
  puts "Response sample: #{body.grep(/doctype/i)[0]}\nResponse status: #{response.status}"

  # This hangs
  client.close

  puts "Finished closing."
end

Sample run:

If I comment out client.close, the program finishes.

But I tried a couple of websites. I randomly changed github.com to airbnb.com and /about to /how-it-works, and the behavior was different:

close still hung, as with GitHub.
But if I remove the close call, it prints Finished closing and still hangs, never finishing.

Here are the changes to make this happen:

site = 'https://www.airbnb.com'
path = '/how-it-works'

...

# client.close

Samuel Williams · Answer 1 · Mon Dec 02 2019 09:33:25 GMT+0800 (China Standard Time)

Use response.read instead of response.body.read (which only returns one chunk), or call response.close (may terminate the underlying connection) or response.finish (will read the entire response).
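To illustrate why reading a single chunk leaves a response unfinished, here is a toy model in plain Ruby. ChunkedBody and its methods are hypothetical names made up for this sketch; it is not the async-http implementation and only mimics the "one chunk per read" behaviour described above.

```ruby
# Toy model of a streamed body. ChunkedBody is a hypothetical class for
# illustration only; it is NOT part of async-http.
class ChunkedBody
  def initialize(chunks)
    @chunks = chunks.dup
  end

  # Returns the next chunk, or nil once the body is exhausted.
  def read
    @chunks.shift
  end

  # Keeps reading until the body is exhausted and returns the whole payload.
  def read_all
    buffer = +""
    while (chunk = read)
      buffer << chunk
    end
    buffer
  end

  # True when no unread chunks remain.
  def drained?
    @chunks.empty?
  end
end

CHUNKS = ["<!DOCTYPE html>\n", "<html>\n", "</html>\n"].freeze

single = ChunkedBody.new(CHUNKS)
single.read
puts "drained after one read: #{single.drained?}"

full = ChunkedBody.new(CHUNKS)
full.read_all
puts "drained after read_all: #{full.drained?}"
```

In this model a single read leaves chunks pending, which is the analogue of a response that still occupies its connection when the client is closed.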

Brian Morearty · Answer 2 · Mon Dec 02 2019 09:55:25 GMT+0800 (China Standard Time)

Ok, switching from response.body.read to response.read fixes it. Thanks.

But if I forget the client.close it still hangs, which surprises me. I know one should always close things that are opened but I expected a memory leak instead of a hang.

Samuel Williams · Answer 3 · Mon Dec 02 2019 10:45:47 GMT+0800 (China Standard Time)

Yeah, the way it currently works could be different. It hangs because there is an HTTP/2 background reader.
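The effect of a background reader can be sketched with a thread-based analogy. This is only an analogy under stated assumptions: async uses tasks rather than threads, and reader_finished? is a hypothetical helper invented for this example. The point it shows is that a reader blocked waiting for the next frame never finishes on its own, so anything waiting for it keeps waiting until the connection is closed.

```ruby
# Thread-based analogy only; reader_finished? is a hypothetical helper and
# does not reflect how async-http is implemented.
def reader_finished?(close_connection:)
  frames = Queue.new

  # Background reader: blocks until a frame arrives, and stops when the
  # queue is closed (pop returns nil on a closed, empty queue).
  reader = Thread.new do
    loop do
      frame = frames.pop
      break if frame.nil?
    end
  end

  # Closing the "connection" is what lets the reader loop exit.
  frames.close if close_connection

  # Wait briefly for the reader, the way an enclosing block waits for its
  # children. join returns nil if the timeout expires first.
  finished = !reader.join(0.2).nil?

  # Clean up so that this demo itself does not hang.
  reader.kill unless finished
  finished
end

puts "reader finished without close: #{reader_finished?(close_connection: false)}"
puts "reader finished after close:   #{reader_finished?(close_connection: true)}"
```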

Brian Morearty · Answer 4 · Mon Dec 02 2019 12:59:43 GMT+0800 (China Standard Time)

If it’s possible to make it not hang in this scenario, I think it would be helpful. I could see this causing a lot of confusion over the long term.

Samuel Williams · Answer 5 · Mon Dec 02 2019 13:40:45 GMT+0800 (China Standard Time)

It hangs but it also prints out a log message (waiting for pool to drain).

The behaviour is definitely not the same between HTTP/1 (doesn't require a background reader) and HTTP/2 (requires a background reader). That bothers me slightly.

But it's also tricky. HTTP/2 semantics make it more difficult to use... I've seen people trying to do the following:

Async do |task|
    client = connect
    task.reactor.stop
end

# some other code

Async do |task|
    client.do_some_request
    # etc
end

The problem here is that Reactor#stop doesn't stop all child tasks. I'm not sure that was the right decision on my part. I may rename that method #pause and give Reactor#stop the same semantics as Task#stop.

The semantics of this hypothetical #pause are useful but tricky. If you want to embed Async::Reactor into another run-loop you need the ability to run it for a short duration, e.g. 10ms or something like that. I'm on the fence as to whether this should be an "allowed" use case, but here is where it gets tricky:

If you have a reactor, and you make, say, an HTTP/2 connection, and you don't service that connection regularly, things like ping/pong, send/receive windows, etc. might not be updated frequently enough (e.g. with gaps of more than 1 second). This may cause the remote end to drop the connection or just cause weird issues/latency.

A more basic example would be someone who schedules a timer to run every 1 second, but only updates the run-loop every 1.5 seconds. Of course it will not work correctly.
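The timer example can be worked through with a small simulation. This is a minimal sketch in plain Ruby with a hypothetical helper, timer_lateness, that models a run-loop which only checks its timers when it is serviced; it does not use Async::Reactor. Rational numbers keep the arithmetic exact.

```ruby
# Hypothetical helper for illustration; it does not use Async::Reactor.
# Returns how late (in seconds) each firing is when a timer due every
# `interval` seconds is only checked every `service_period` seconds.
def timer_lateness(interval:, service_period:, duration:)
  lateness = []
  next_due = interval
  now = Rational(0)

  while now + service_period <= duration
    now += service_period # the run-loop is serviced only at these instants

    # Fire every occurrence that became due since the previous service.
    while next_due <= now
      lateness << (now - next_due)
      next_due += interval
    end
  end

  lateness
end

late = timer_lateness(interval: 1, service_period: Rational(3, 2), duration: 6)
p late.map(&:to_f) # → [0.5, 1.0, 0.0, 0.5, 1.0, 0.0]
```

Over 6 seconds the timer still fires six times, but firings arrive up to a full second late and some arrive in pairs during the same service, which is the "will not work correctly" case described above.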

So it comes back to the following:

Async do
    # What resources can outlive this block?
end

Right now, tasks (e.g. background readers) can't escape this unless you explicitly call Reactor#stop (which should probably be renamed Reactor#pause). Sockets and other I/O can escape (e.g. connected sockets) but this can cause unexpected behaviour which is why the debug specs flag this as an issue.

What could be good, as a first step, is to ensure we validate and report on these issues very clearly during specs, e.g. async-http could give more thorough warnings:

Waiting for pool to drain. There are 3 outstanding connections that have not been closed.
- Connection 1 POST http://foo/bar started at file.rb:32
- ... etc

I'm a strong advocate for strict, correct and verbose tests.
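As a rough idea of how the more detailed warning proposed above could be assembled, here is a minimal sketch in plain Ruby. TrackingPool, acquire, release and drain_report are hypothetical names invented for this example; none of them are part of async-http, and the real pool may track connections quite differently.

```ruby
# Hypothetical pool wrapper for illustration; these names do not exist in
# async-http.
class TrackingPool
  Entry = Struct.new(:description, :location)

  def initialize
    @outstanding = {}
    @next_id = 0
  end

  # Records a connection together with the call site that acquired it.
  def acquire(description)
    id = (@next_id += 1)
    site = caller_locations(1, 1).first
    location = "#{File.basename(site.path.to_s)}:#{site.lineno}"
    @outstanding[id] = Entry.new(description, location)
    id
  end

  # Marks a connection as closed.
  def release(id)
    @outstanding.delete(id)
  end

  # Builds the warning text, or returns nil when nothing is outstanding.
  def drain_report
    return nil if @outstanding.empty?

    lines = ["Waiting for pool to drain. There are #{@outstanding.size} " \
             "outstanding connections that have not been closed."]
    @outstanding.each do |id, entry|
      lines << "- Connection #{id} #{entry.description} started at #{entry.location}"
    end
    lines.join("\n")
  end
end

pool = TrackingPool.new
pool.acquire("POST http://foo/bar")
closed = pool.acquire("GET http://foo/baz")
pool.acquire("GET http://foo/qux")
pool.release(closed)
puts pool.drain_report
```

Capturing the call site at acquire time is the design choice that makes the report actionable: the warning can point at the line that opened each connection that was never closed.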