nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨

Home Page: https://nodejs.org


Keep-alive connections do not get closed with server.close()

codygustafson opened this issue · comments

The following will remain open as long as the client makes a request within one-second intervals, making it possible that a server will never gracefully close. With the default two-minute timeout, a user or API client would need to stop traffic for two minutes before the close takes effect. I propose closing keep-alive sockets immediately after the next request. This avoids having to track everything that would need to be closed, while making the maximum time to gracefully shut down fixed (and finite).

var http = require('http');

var server = http.createServer(function(req, res) {
  res.end('test\n');
  server.close(); // stops accepting new connections; existing sockets linger
}).listen(8000, '127.0.0.1');
server.setTimeout(1000); // idle sockets are timed out after one second
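
For illustration, a client like the following (a hypothetical reproduction using a keep-alive agent) keeps the server alive indefinitely despite the close() call:

var http = require('http');

// Reuse a single keep-alive socket and issue a request every 500 ms, so the
// server's one-second timeout never fires and close() never completes.
var agent = new http.Agent({ keepAlive: true, maxSockets: 1 });
setInterval(function() {
  http.get({ host: '127.0.0.1', port: 8000, agent: agent }, function(res) {
    res.resume(); // drain the response so the socket can be reused
  });
}, 500);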

Thanks for opening an issue. I think a similar solution to this issue is being pursued in #2534.

Nah, #2534 is about TCP keep-alive, while this issue is about HTTP keep-alive. #2534 is a partial fix for this issue for non-malicious clients, and still imposes an unnecessary timeout delay.

@kanongil #2534 is about HTTP and HTTPS keep-alive, as indicated by the pull-request title (http: ...) and label. I added a separate keepAliveTimeout (defaulting to 5 seconds), but if you want to destroy sockets immediately on server.close() (without waiting for 5 seconds), we need to track each socket (adding it to an array and removing it after destroy). I do not like this overhead; do you have a better solution?

Rather than an array, you should probably consider a WeakMap now. It is overhead, but that's still better than a broken feature.
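
For the record, weak collections cannot be iterated, so actually destroying the tracked sockets needs an iterable container; a minimal sketch using a Set:

var sockets = new Set();

server.on('connection', function(socket) {
  sockets.add(socket);
  socket.on('close', function() {
    sockets.delete(socket); // keep the set from growing without bound
  });
});

function forceClose(callback) {
  server.close(callback); // stop accepting new connections
  sockets.forEach(function(socket) {
    socket.destroy(); // forcibly drop whatever is still open
  });
}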

@tshemsedinov Right you are, though the partial fix part still stands.

The main problem with the TCP timeout fix is that malicious clients (intentional or otherwise) can keep the connection alive by creating new requests on the socket, e.g. when polling for a value every 2 seconds.

Just thought I would mention again that the main issue here is that it is possible that sockets never get destroyed. To avoid the overhead @tshemsedinov is talking about, I still think it would be best to do what I mentioned in the description: destroy sockets on the next request, so the timeout is never refreshed for that socket. I think everyone can live with a guarantee that all sockets are destroyed within the timeout period.

Depends what the timeout period is :) It's been several minutes up till now, right? That's too long for my process to shut down.

Good point.

keepAliveMsecs: {Integer} When using HTTP KeepAlive, how often to send TCP KeepAlive packets over sockets being kept alive. Default = 1000. Only relevant if keepAlive is set to true.

I think this callback would be another good place to karate chop these sockets. That should allow all sockets to be destroyed within a second with no extra overhead.
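
For reference, keepAliveMsecs is an option on the client-side http.Agent rather than the server; for example:

var http = require('http');

// Client-side agent: keepAlive reuses sockets across requests, and
// keepAliveMsecs sets the TCP keep-alive probe interval on those sockets.
var agent = new http.Agent({ keepAlive: true, keepAliveMsecs: 1000 });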

This should be fixed, or noted in the documentation for server.close([callback]). I spent hours chasing this bug (in a complex application), trying to work out why the HTTP server closes the connection minutes after I asked it to close, with no active HTTP request.

The documentation seems pretty clear to me - "Stops the server from accepting new connections." - but if you think it can be improved, please file a PR with your suggested changes. Do consult CONTRIBUTING.md first, though.

The clarity of the documentation is not the problem. The problem is that the current implementation of the http server prevents a person from gracefully shutting down an HTTP server.

Define 'gracefully'? If you mean 'forcibly close client connections', then no, it doesn't do that, and that's deliberate.

OP's suggested change is a no-go because it makes it impossible to keep existing connections open indefinitely (which is a use case that should be supported), whereas force-closing can easily be implemented on top of the current behavior - just maintain a list of open connections. There is probably already an npm module for that.

force-closing can easily be implemented on top of the current behavior - just maintain a list of open connections.

While tracking the connections is somewhat simple, actually determining the state of these, and acting on it, is not. As far as I am concerned, a graceful server close should:

  1. Stop accepting new connections (existing behavior).
  2. Immediately close any completely idle connections, e.g. connections with no incoming or outgoing messages pending.
  3. Stop accepting new requests on all remaining connections, and close these once the outgoing queue is drained.

For 2., the action is simple (a close on the socket) but detecting when to apply it seems tricky.

The major pain point is 3., which I don't think is possible to solve using public APIs, and maybe not even using private ones, as some of the state is captured in a closure scope.
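
As a rough sketch of steps 1 and 2 (with a crude stab at 3), each socket can be marked idle or busy from the request lifecycle; the _isIdle property and closing flag below are ad-hoc markers, not core APIs:

var closing = false;
var sockets = new Set();

server.on('connection', function(socket) {
  socket._isIdle = true; // ad-hoc flag, not part of net.Socket
  sockets.add(socket);
  socket.on('close', function() {
    sockets.delete(socket);
  });
});

server.on('request', function(req, res) {
  req.socket._isIdle = false;
  res.on('finish', function() {
    req.socket._isIdle = true;
    if (closing) req.socket.destroy(); // approximates step 3 once drained
  });
});

function gracefulClose(callback) {
  closing = true;
  server.close(callback); // step 1: stop accepting new connections
  sockets.forEach(function(socket) {
    if (socket._isIdle) socket.destroy(); // step 2: drop idle sockets
  });
}

Pipelined requests and half-read request bodies are exactly where this sketch falls short, which is the point being made about 3.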

If anyone is interested, I coded up a function that will enable graceful closing on an http server. Heads up that it's written in TypeScript, but it shouldn't be too hard to understand. Hopefully this is helpful to others that have been wrestling with this.

https://gist.github.com/jinxidoru/0611100d1d12ecddfa04

Spent quite a bit of time debugging this issue today. The Internet seems to believe this is how server.close() currently works.

@bnoordhuis I don't think the algorithm described in #2642 (comment) should be the default for close, but I think it should be achievable based on reading the docs. I've poked idly at this a few times without success. One complication is that it is not in fact possible to do a "read close" in TCP, only a "write close". So while it's possible to do a write close after replying, without somehow corking/discarding incoming data or requests after the currently-being-read request (if any) is completely read, it's not really possible to do. I think we need to trap the request event, check some "closed" state, and then just ignore the request, and somehow get an event for when any currently pending responses are drained. Maybe that's possible with the http API, but it's not obvious to me how. Could be just a doc problem, could be a missing feature.

@jinxidoru TypeScript decreases the readability of your example; everyone here reads JavaScript, not necessarily TypeScript. If I understand your code, it destroys the underlying connections whether or not they are in the process of handling a request, which isn't exactly graceful.

Just for context on why I (think) I need this close behavior:

I've got an API running on Heroku that utilizes keep-alive. Heroku, at any time, may send a SIGTERM to my app, asking it to gracefully shut down. If it does not shut down within 10 seconds, a SIGKILL follows.

My API reacts to SIGTERM by cleaning up my timers and database connections, calling server.close (disallowing any new connections), as well as catching any requests over existing keep-alive connections and sending a 503 response with Connection: close. Sadly, the process will remain alive for the full 10 seconds if there are any existing connections that haven't sent a new request... meaning my API will be unavailable for a full 10 seconds, every time. :-/

What I'm thinking about doing is just calling process.exit() once misc cleanup has finished, regardless of whether HTTP connections have closed, risking closing connections while in the middle of accepting/sending data.
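
A condensed sketch of that shutdown sequence (the shuttingDown flag and 503 handling are illustrative, not a library API):

var http = require('http');
var shuttingDown = false;

var server = http.createServer(function(req, res) {
  if (shuttingDown) {
    res.statusCode = 503;
    res.setHeader('Connection', 'close'); // ask keep-alive clients to let go
    return res.end();
  }
  res.end('ok\n');
});
server.listen(process.env.PORT || 8000);

process.on('SIGTERM', function() {
  shuttingDown = true;
  // cleanup of timers and database connections would go here
  server.close(function() {
    process.exit(0); // only reached once every socket has gone away
  });
});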

What I'm thinking about doing is just calling process.exit() once misc cleanup has finished, regardless of whether HTTP connections have closed, risking closing connections while in the middle of accepting/sending data.

If you have a hard deadline, that's what effectively happens anyway. You (i.e. node) may have written the response to the socket but that doesn't mean the kernel has sent it or that the remote end has received it. They may be on a slow or congested uplink.

If you have a hard deadline, that's what effectively happens anyway. You (i.e. node) may have written the response to the socket but that doesn't mean the kernel has sent it or that the remote end has received it. They may be on a slow or congested uplink.

Networks are unreliable, sure. That's why we have TCP. But using the network (or kernel) as an excuse for poor behavior elsewhere only compounds the problem. Just because the message might not get sent doesn't mean we shouldn't try to gracefully shut down incoming connections.

@davidmurdoch based on your description, I think you're solving the issue well - and also, your app shouldn't be subject to a 10-second downtime during Heroku's restart. The downtime will only be however long it takes your app to start (which for Node apps tends to be about 1 second).

Additionally, during that time your app won't be fully down - instead, Heroku's router will queue requests which will be sent to the new app instance as soon as it listens to PORT.

@nodejs/http

I wouldn't mind seeing a boolean flag or something that can be passed to server.close() to additionally allow closing of idle keepalive connections (if that's feasible/possible).

Only if the server instance tracks socket objects. It's not unthinkable but it seems wasteful when the common case is that it will be unused. Dealing with sockets/servers from the cluster module is also an open problem.

In theory we could walk open persistent handles and filter on socket handles belonging to a particular server instance but that is pretty horrid and inefficient.

@hunterloftis Oh wow, that's awesome! I had ended up deciding to disable keep-alive altogether, so I'm very happy you told me this. Thanks!

@bnoordhuis I was thinking about the same solution. Seems pretty nasty, but most people are probably fine with close() not being 100% efficient. I'm guessing they will be less fine with the typical runtime of an HTTP server slowing down because of the added bookkeeping. Nasty, but if it works...

When we open these things in libuv we should definitely be doing setsockopt(2) with SO_LINGER.

FYI - Here's a simple implementation of tracking & closing keep-alive connections https://github.com/thedillonb/http-shutdown

@isaacs has a super simple module for this that I use: https://github.com/isaacs/server-destroy

@BrandonZacharie That module destroys everything immediately, including active connections, and does nothing to preserve the graceful shutdown mechanics of the close() method.

Just a quick note, the problem is even worse when using the cluster module.

As disconnect is called in cluster.js, it waits until all servers close. When using keep-alive connections (e.g. behind an AWS load balancer), this never occurs: although the server socket is closed, keep-alive client sockets never are.

This causes our workers to never restart when using a cluster-based process manager like PM2.

My simple solution is to destroy sockets on request finish:

server.on('request', (request, response) => {
	response.on('finish', () => request.socket.destroy());
});

This way, when you call server.close() only real in-progress requests are considered before closing.

The problem with that is that it denies clients the ability to use HTTP keep-alives to minimise latency when requesting multiple resources from your server.

Actually, it's even worse than that. Because a client issuing an HTTP keep-alive request won't be expecting the socket to close, it will go ahead and attempt to use its socket to talk to your server, and won't realise until after the TCP timeout that the connection is dead.
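
A gentler variant of the quoted snippet: rather than destroying the socket outright, advertise closure during shutdown so clients are warned (shuttingDown is a hypothetical flag set by your shutdown handler; prependListener makes this run before the main request handler):

server.prependListener('request', function(request, response) {
  if (shuttingDown) {
    // Node closes the connection after a response marked Connection: close.
    response.setHeader('Connection', 'close');
  }
});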

I just published a small, simple, heavily tested module to solve this in userland:

const server = stoppable(http.createServer(handler))
server.stop()
  • does the bookkeeping on connection and req/res. There's no way around this.
  • stops accepting new reqs immediately
  • waits for in-flight requests up to a grace period
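
For example, wired into a SIGTERM handler (assuming stoppable's documented stoppable(server, grace) signature, with the grace period in milliseconds):

const http = require('http')
const stoppable = require('stoppable')

const server = stoppable(http.createServer((req, res) => {
  res.end('hello\n')
}), 10000) // give in-flight requests up to 10 seconds
server.listen(8000)

process.on('SIGTERM', () => {
  server.stop(() => process.exit(0)) // fires once the server is fully stopped
})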

I'll go ahead and close this out. server.close() is working as documented and as #2642 (comment) demonstrates, it's extensible enough to make it work for different use cases.

@hunterloftis - That little stoppable module looks great. Excellent motivation documentation, unobtrusive and consistent with current apis, respectful of TCP by sending a FIN packet, love it.

which node release is this behavior fixed in?

@aoberoi
Unfortunately, it is not fixed (some argue that it's working as intended). I would suggest having a look at the excellent stoppable module.

stoppable seems fairly inactive and isn't enough for me; it also has open issues without replies, so I ended up with:
https://github.com/LuKks/like-server
Would be great if we could test it more.

This is insane. I also spent 8 hours trying to understand why a server close would take 5 seconds; it turns out it's because keepAliveTimeout defaults to 5000.

Even server.closeAllConnections() doesn't force-close them. Maybe this needs a force argument; otherwise this really shouldn't be a closed issue.

@Tofandel Sounds implausible, it basically calls socket.destroy() on all connections it knows about. Please put together a small test case and open a new issue if you're sure that is what's happening.
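
For reference, on Node.js v18.2.0 and later the built-ins cover most of this; a minimal sketch of a modern graceful shutdown:

// close() stops new connections; closeIdleConnections() drops keep-alive
// sockets with no request in flight; closeAllConnections() is the big
// hammer for whatever remains after a grace period.
server.close(() => console.log('server closed'));
server.closeIdleConnections();
setTimeout(() => server.closeAllConnections(), 10000).unref();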

Consider this module: https://github.com/LuKks/graceful-http

I based the library on this (plus real-world usage):
https://blog.dashlane.com/implementing-nodejs-http-graceful-shutdown/
That blog post explains the problem and solution perfectly!

My simple solution is to destroy sockets on request finish:

server.on('request', (request, response) => {
	response.on('finish', () => request.socket.destroy());
});

This way, when you call server.close() only real in-progress requests are considered before closing.

I am on Node.js 16.15.0 and, using your suggestion, I'm able to invoke the close callback properly:

server.close((err) => {
  if (err) {
    console.error(err);
  }
  
  console.info("http server closed successfully. Exiting!");
});

The best part is I don't have to use process.exit(0), so everything else keeps running and isn't cut short by an explicit process.exit.

After seeing Node still get the basics wrong, I'm glad I left the Node world years ago. The fragile castles of sand are real in this realm. Godspeed, everyone stuck here.