Hangs when closing an async http client
BMorearty opened this issue · comments
I don't know if I'm doing this wrong, but this code hangs consistently when calling Async::HTTP::Client#close
:
require 'async'
require 'async/http'
site = 'https://github.com'
path = '/about'
Async do
uri = URI.parse(site)
endpoint = Async::HTTP::Endpoint.new(uri)
client = Async::HTTP::Client.new(endpoint)
response = client.get(path)
body = response.body.read.split("\n")
puts "Response sample: #{body.grep(/doctype/i)[0]}\nResponse status: #{response.status}"
# This hangs
client.close
puts "Finished closing."
end
Sample run:
If I comment out client.close
, the program finishes.
But! If I tried a couple of websites. I randomly tried changing github.com
to airbnb.com
and changed /about
to /how-it-works
, and the behavior was different:
close
still hung, as with GitHub.- But if I remove the
close
call, it printsFinished closing
and still hangs, never finishing.
Here are the changes to make this happen:
site = 'https://www.airbnb.com'
path = '/how-it-works'
...
# client.close
Use response.read
instead of response.body.read
(which only returns one chunk) or call response.close
(May terminate underlying connection) or response.finish
(will read entire response).
Ok, switching from response.body.read
to response.read
fixes it. Thanks.
But if I forget the client.close
it still hangs, which surprises me. I know one should always close things that are opened but I expected a memory leak instead of a hang.
Yeah the way it currently works could be different. It hangs because there is an HTTP/2 background reader
If it’s possible to make it not hang in this scenario, I think it would be helpful. I could see this causing a lot of confusion over the long term.
It hangs but it also prints out a log message (waiting for pool to drain).
It's definitely not the same between HTTP/1 (doesn't require background reader) and HTTP/2 (requires background reader). That bothers me slightly.
But it's also tricky. HTTP/2 semantics make it more difficult to use... I've seen people trying to do the following:
Async do |task|
client = connect
task.reactor.stop
end
# some other code
Async do |task|
client.do_some_request
# etc
end
The problem here is that Reactor#stop doesn't stop all child tasks. I'm not sure that was the right decision on my part. I may rename that method #pause
and use Reactor#stop
as the same semantics as Task#stop
.
The semantics of this hypothetical #pause
are useful but tricky. If you want to embed Async::Reactor
into another run-loop you need the ability to run it for a short duration, e.g. 10ms or something like that. I'm on the fence as to whether this should be an "allowed" use case, but here is where it gets tricky:
If you have a reactor, and you make, say, HTTP/2 connection, and if you don't service that connection regularly, stuff like ping/pong, send/receive windows, etc might not be updated frequently enough (e.g. more than 1 second). This may cause the remote end to drop the connection or just cause weird issues/latency.
A more basic example would be someone who schedules a timer to run ever 1 second, but only updates the run-loop every 1.5 seconds. Of course it will not work correctly.
So it comes back to the following:
Async do
# What resources can outlive this block?
end
Right now, tasks (e.g. background readers) can't escape this unless you explicitly call Reactor#stop
(which should probably be renamed Reactor#pause
). Sockets and other I/O can escape (e.g. connected sockets) but this can cause unexpected behaviour which is why the debug specs flag this as an issue.
What could be good, as a first step, is to ensure we validate and report on these issues very clearly during specs, e.g. async-http could give more thorough warnings:
Waiting for pool to drain. There are 3 outstanding connections that have not been closed.
- Connection 1 POST http://foo/bar started at file.rb:32
- ... etc
I'm a strong advocate for strict, correct and verbose tests.