nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨

Home Page: https://nodejs.org


Keep-alive connections do not get closed with server.close()

codygustafson opened this issue · comments

The following will remain open as long as the client makes a request within one-second intervals, making it possible that a server will never gracefully close. With the default two-minute timeout, a user or API client would need to stop traffic for two minutes before the close takes effect. I propose closing keep-alive sockets immediately after the next request. This avoids having to track everything that would need to be closed, while making the maximum time to gracefully shut down fixed (and finite).

var http = require('http');

var server = http.createServer(function(req, res) {
  res.end('test\n');
  server.close(); // stops accepting new connections; existing sockets linger
}).listen(8000, '127.0.0.1');
server.setTimeout(1000); // idle sockets are timed out after one second
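
For illustration, a client like the following (a hypothetical reproduction using a keep-alive agent) keeps the server alive indefinitely despite the close() call:

var http = require('http');

// Reuse a single keep-alive socket and issue a request every 500 ms, so the
// server's one-second timeout never fires and close() never completes.
var agent = new http.Agent({ keepAlive: true, maxSockets: 1 });
setInterval(function() {
  http.get({ host: '127.0.0.1', port: 8000, agent: agent }, function(res) {
    res.resume(); // drain the response so the socket can be reused
  });
}, 500);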

Thanks for opening an issue. I think a similar solution to this issue is being pursued in #2534.

Nah, #2534 is about TCP keep-alive, while this issue is about HTTP keep-alive. #2534 is a partial fix for this issue for non-malicious clients, and still imposes an unnecessary timeout delay.

@kanongil #2534 is about HTTP and HTTPS keep-alive, as indicated by the pull-request title (http: ...) and label. I added a separate keepAliveTimeout (defaulting to 5 seconds), but if you want to destroy sockets immediately on server.close() (without waiting for 5 seconds), we need to track each socket (adding it to an array and removing it after destroy). I do not like this overhead; do you have a better solution?

Rather than an array, you should probably consider a WeakMap now. It is overhead, but that's still better than a broken feature.
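
For the record, weak collections cannot be iterated, so actually destroying the tracked sockets needs an iterable container; a minimal sketch using a Set:

var sockets = new Set();

server.on('connection', function(socket) {
  sockets.add(socket);
  socket.on('close', function() {
    sockets.delete(socket); // keep the set from growing without bound
  });
});

function forceClose(callback) {
  server.close(callback); // stop accepting new connections
  sockets.forEach(function(socket) {
    socket.destroy(); // forcibly drop whatever is still open
  });
}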

@tshemsedinov Right you are, though the partial fix part still stands.

The main problem with the TCP timeout fix is that malicious clients (intentional or otherwise) can keep the connection alive by creating new requests on the socket, e.g. when polling for a value every 2 seconds.

Just thought I would mention again that the main issue here is that it is possible that sockets never get destroyed. To avoid the overhead @tshemsedinov is talking about, I still think it would be best to do what I mentioned in the description: destroy sockets on the next request, so the timeout is never refreshed for that socket. I think everyone can live with a guarantee that all sockets are destroyed within the timeout period.

Depends what the timeout period is :) It's been several minutes up till now, right? That's too long for my process to shut down.

Good point.

keepAliveMsecs: {Integer} When using HTTP KeepAlive, how often to send TCP KeepAlive packets over sockets being kept alive. Default = 1000. Only relevant if keepAlive is set to true.

I think this callback would be another good place to karate chop these sockets. That should allow all sockets to be destroyed within a second with no extra overhead.
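
For reference, keepAliveMsecs is an option on the client-side http.Agent rather than the server; for example:

var http = require('http');

// Client-side agent: keepAlive reuses sockets across requests, and
// keepAliveMsecs sets the TCP keep-alive probe interval on those sockets.
var agent = new http.Agent({ keepAlive: true, keepAliveMsecs: 1000 });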

This should be fixed, or noted in the documentation for server.close([callback]). I spent hours chasing this bug (in a complex application), trying to work out why the HTTP server closes the connection minutes after I asked it to close, with no active HTTP request.

The documentation seems pretty clear to me - "Stops the server from accepting new connections." - but if you think it can be improved, please file a PR with your suggested changes. Do consult CONTRIBUTING.md first, though.

The clarity of the documentation is not the problem. The problem is that the current implementation of the http server prevents a person from gracefully shutting down an HTTP server.

Define 'gracefully'? If you mean 'forcibly close client connections', then no, it doesn't do that, and that's deliberate.

OP's suggested change is a no-go because it makes it impossible to keep existing connections open indefinitely (which is a use case that should be supported), whereas force-closing can easily be implemented on top of the current behavior - just maintain a list of open connections. There is probably already an npm module for that.

force-closing can easily be implemented on top of the current behavior - just maintain a list of open connections.

While tracking the connections is somewhat simple, actually determining the state of these, and acting on it, is not. As far as I am concerned, a graceful server close should:

  1. Stop accepting new connections (existing behavior).
  2. Immediately close any completely idle connections, e.g. connections with no incoming or outgoing messages pending.
  3. Stop accepting new requests on all remaining connections, and close these once the outgoing queue is drained.

For 2., the action is simple (a close on the socket) but detecting when to apply it seems tricky.

The major pain point is 3., which I don't think is possible to solve using public APIs, and maybe not even using private ones, as some of the state is captured in a closure scope.
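
As a rough sketch of steps 1 and 2 (with a crude stab at 3), each socket can be marked idle or busy from the request lifecycle; the _isIdle property and closing flag below are ad-hoc markers, not core APIs:

var closing = false;
var sockets = new Set();

server.on('connection', function(socket) {
  socket._isIdle = true; // ad-hoc flag, not part of net.Socket
  sockets.add(socket);
  socket.on('close', function() {
    sockets.delete(socket);
  });
});

server.on('request', function(req, res) {
  req.socket._isIdle = false;
  res.on('finish', function() {
    req.socket._isIdle = true;
    if (closing) req.socket.destroy(); // approximates step 3 once drained
  });
});

function gracefulClose(callback) {
  closing = true;
  server.close(callback); // step 1: stop accepting new connections
  sockets.forEach(function(socket) {
    if (socket._isIdle) socket.destroy(); // step 2: drop idle sockets
  });
}

Pipelined requests and half-read request bodies are exactly where this sketch falls short, which is the point being made about 3.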

If anyone is interested, I coded up a function that will enable graceful closing on an http server. Heads up that it's written in TypeScript, but it shouldn't be too hard to understand. Hopefully this is helpful to others that have been wrestling with this.

https://gist.github.com/jinxidoru/0611100d1d12ecddfa04

Spent quite a bit of time debugging this issue today. The Internet seems to believe this is how server.close() currently works.

@bnoordhuis I don't think the algorithm described in #2642 (comment) should be the default for close, but I think it should be achievable based on reading the docs. I've poked idly at this a few times without success. One complication is that it is not in fact possible to do a "read close" in TCP, only a "write close". So while it's possible to do a write close after replying, without somehow corking/discarding incoming data or requests after the currently-being-read request (if any) is completely read, it's not really possible to do. I think we need to trap the request event, check some "closed" state, and then just ignore the request, and somehow get an event for when any currently pending responses are drained. Maybe that's possible with the http API, but it's not obvious to me how. Could be just a doc problem, could be a missing feature.

@jinxidoru TypeScript decreases the readability of your example; everyone here reads JavaScript, not necessarily TypeScript. If I understand your code, it destroys the underlying connections whether or not they are in the process of handling a request, which isn't exactly graceful.

Just for context on why I (think) I need this close behavior:

I've got an API running on Heroku that utilizes keep-alive. Heroku, at any time, may send a SIGTERM to my app, asking it to gracefully shut down. If it does not shut down within 10 seconds, a SIGKILL follows.

My API reacts to SIGTERM by cleaning up my timers and database connections, calling server.close (disallowing any new connections), as well as catching any requests over existing keep-alive connections and sending a 503 response with Connection: close. Sadly, the process will remain alive for the full 10 seconds if there are any existing connections that haven't sent a new request... meaning my API will be unavailable for a full 10 seconds, every time. :-/

What I'm thinking about doing is just calling process.exit() once misc cleanup has finished, regardless of whether HTTP connections have closed, risking closing connections while in the middle of accepting/sending data.
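
A condensed sketch of that shutdown sequence (the shuttingDown flag and 503 handling are illustrative, not a library API):

var http = require('http');
var shuttingDown = false;

var server = http.createServer(function(req, res) {
  if (shuttingDown) {
    res.statusCode = 503;
    res.setHeader('Connection', 'close'); // ask keep-alive clients to let go
    return res.end();
  }
  res.end('ok\n');
});
server.listen(process.env.PORT || 8000);

process.on('SIGTERM', function() {
  shuttingDown = true;
  // cleanup of timers and database connections would go here
  server.close(function() {
    process.exit(0); // only reached once every socket has gone away
  });
});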

What I'm thinking about doing is just calling process.exit() once misc cleanup has finished, regardless of whether HTTP connections have closed, risking closing connections while in the middle of accepting/sending data.

If you have a hard deadline, that's what effectively happens anyway. You (i.e. node) may have written the response to the socket but that doesn't mean the kernel has sent it or that the remote end has received it. They may be on a slow or congested uplink.

If you have a hard deadline, that's what effectively happens anyway. You (i.e. node) may have written the response to the socket but that doesn't mean the kernel has sent it or that the remote end has received it. They may be on a slow or congested uplink.

Networks are unreliable, sure. That's why we have TCP. But using the network (or kernel) as an excuse for poor behavior elsewhere only compounds the problem. Just because the message might not get sent doesn't mean we shouldn't try to gracefully shut down incoming connections.

@davidmurdoch based on your description, I think you're solving the issue well - and also, your app shouldn't be subject to a 10-second downtime during Heroku's restart. The downtime will only be however long it takes your app to start (which for Node apps tends to be about 1 second).

Additionally, during that time your app won't be fully down - instead, Heroku's router will queue requests which will be sent to the new app instance as soon as it listens to PORT.

@nodejs/http

I wouldn't mind seeing a boolean flag or something that can be passed to server.close() to additionally allow closing of idle keepalive connections (if that's feasible/possible).

Only if the server instance tracks socket objects. It's not unthinkable but it seems wasteful when the common case is that it will be unused. Dealing with sockets/servers from the cluster module is also an open problem.

In theory we could walk open persistent handles and filter on socket handles belonging to a particular server instance but that is pretty horrid and inefficient.

@hunterloftis Oh wow, that's awesome! I had ended up deciding to disable keep-alive altogether, so I'm very happy you told me this. Thanks!

@bnoordhuis I was thinking about the same solution. Seems pretty nasty, but most people are probably fine with close() not being 100% efficient. I'm guessing they will be less fine with the typical runtime of an HTTP server slowing down because of the added bookkeeping. Nasty, but if it works...

When we open these things in libuv we should definitely be doing setsockopt(2) with SO_LINGER.

FYI - Here's a simple implementation of tracking & closing keep-alive connections https://github.com/thedillonb/http-shutdown

@isaacs has a super simple module for this that I use: https://github.com/isaacs/server-destroy

@BrandonZacharie That module destroys everything immediately, including active connections, and does nothing to preserve the graceful shutdown mechanics of the close() method.

Just a quick note, the problem is even worse when using the cluster module.

As disconnect is called in cluster.js, it waits until all servers close. When using keep-alive connections (e.g. behind an AWS load balancer), this never occurs: although the server socket is closed, keep-alive client sockets never are.

This causes our workers to never restart when using a cluster-based process manager like PM2.

My simple solution is to destroy sockets on request finish:

server.on('request', (request, response) => {
	response.on('finish', () => request.socket.destroy());
});

This way, when you call server.close() only real in-progress requests are considered before closing.

The problem with that is that it denies clients the ability to use HTTP keep-alives to minimise latency when requesting multiple resources from your server.

Actually, it's even worse than that. Because a client issuing an HTTP keep-alive request won't be expecting the socket to close, it will go ahead and attempt to use its socket to talk to your server, and won't realise until after the TCP timeout that the connection is dead.
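
A gentler variant of the quoted snippet: rather than destroying the socket outright, advertise closure during shutdown so clients are warned (shuttingDown is a hypothetical flag set by your shutdown handler; prependListener makes this run before the main request handler):

server.prependListener('request', function(request, response) {
  if (shuttingDown) {
    // Node closes the connection after a response marked Connection: close.
    response.setHeader('Connection', 'close');
  }
});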

I just published a small, simple, heavily tested module to solve this in userland:

const server = stoppable(http.createServer(handler))
server.stop()
  • does the bookkeeping on connection and req/res. There's no way around this.
  • stops accepting new reqs immediately
  • waits for in-flight requests up to a grace period
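
For example, wired into a SIGTERM handler (assuming stoppable's documented stoppable(server, grace) signature, with the grace period in milliseconds):

const http = require('http')
const stoppable = require('stoppable')

const server = stoppable(http.createServer((req, res) => {
  res.end('hello\n')
}), 10000) // give in-flight requests up to 10 seconds
server.listen(8000)

process.on('SIGTERM', () => {
  server.stop(() => process.exit(0)) // fires once the server is fully stopped
})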

I'll go ahead and close this out. server.close() is working as documented and as #2642 (comment) demonstrates, it's extensible enough to make it work for different use cases.

@hunterloftis - That little stoppable module looks great. Excellent motivation documentation, unobtrusive and consistent with current apis, respectful of TCP by sending a FIN packet, love it.

which node release is this behavior fixed in?

@aoberoi
Unfortunately, it is not fixed (some argue that it's working as intended). I would suggest having a look at the excellent stoppable module.

stoppable seems fairly inactive and isn't enough for me; it also has open issues without replies, so I ended up with:
https://github.com/LuKks/like-server
Would be great if we could test it more.

This is insane. I also spent 8 hours trying to understand why a server close would take 5 seconds; it turns out it's because keepAliveTimeout defaults to 5000.

Even server.closeAllConnections() doesn't force-close them. Maybe this needs a force argument; otherwise this really shouldn't be a closed issue.

@Tofandel Sounds implausible, it basically calls socket.destroy() on all connections it knows about. Please put together a small test case and open a new issue if you're sure that is what's happening.
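
For reference, on Node.js v18.2.0 and later the built-ins cover most of this; a minimal sketch of a modern graceful shutdown:

// close() stops new connections; closeIdleConnections() drops keep-alive
// sockets with no request in flight; closeAllConnections() is the big
// hammer for whatever remains after a grace period.
server.close(() => console.log('server closed'));
server.closeIdleConnections();
setTimeout(() => server.closeAllConnections(), 10000).unref();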

Consider this module: https://github.com/LuKks/graceful-http

I based the library on this (plus real-world usage):
https://blog.dashlane.com/implementing-nodejs-http-graceful-shutdown/
That blog post explains the problem and solution perfectly!

My simple solution is to destroy sockets on request finish:

server.on('request', (request, response) => {
	response.on('finish', () => request.socket.destroy());
});

This way, when you call server.close() only real in-progress requests are considered before closing.

I am on Node.js 16.15.0 and, using your suggestion, I'm able to invoke the close callback properly:

server.close((err) => {
  if (err) {
    console.error(err);
  }
  
  console.info("http server closed successfully. Exiting!");
});

The best part is I don't have to use process.exit(0), so everything else keeps running and isn't cut short by an explicit process.exit.

After seeing Node still get the basics wrong, I'm glad I left the Node world years ago. The fragile castles of sand are real in this realm. Godspeed, everyone stuck here.