Memory leak when a stream in the pipeline gets destroyed unexpectedly
alolis opened this issue
Hello,
Background
We are using needle in one of our projects and we have noticed a slow but steady increase in the TCP memory buffers. The Linux command we used to view the TCP memory page allocation is cat /proc/net/sockstat.
Every time we killed the service, the memory consumption dropped significantly so we started digging.
Before kill: (screenshot of cat /proc/net/sockstat output)
After kill: (screenshot of cat /proc/net/sockstat output)
After plenty of tests we started noticing that the needle streams we use do not fire the close event. Now, I am not sure where exactly the problem is, but if the close event does not fire, it means that the underlying resources have not been destroyed. If they are not getting destroyed, then that is consistent with the TCP memory page increase we have been seeing.
How to reproduce
The following sample code shows the problem:
const needle = require('needle');
const {pipeline} = require('stream');
const {createWriteStream} = require('fs');
const url = 'https://releases.ubuntu.com/21.04/ubuntu-21.04-desktop-amd64.iso.torrent';
const readStream = needle.get(url);
readStream.on('end', () => console.log('needle END event fired')); // FIRES JUST FINE
readStream.on('close', () => console.log('needle CLOSE event fired')); // DOES NOT FIRE
const writeStream = createWriteStream('ubuntu-21.04-desktop-amd64.iso.torrent');
pipeline(readStream, writeStream, (error) => {
  if (error) {
    console.log(error);
  }
  // By uncommenting the next line, the CLOSE event WILL fire, but it shouldn't be required.
  // readStream.destroy();
});
Environment
node.js version: v12.16.1
needle version: v2.8.0
Please let me know if I can help in any other way to get to the bottom of this. Thank you for your time.
Hmmm, this is indeed strange. Are you using the 'close' event to trigger a specific behaviour and/or perform additional actions on the stream or connection?
Hi @tomas,
No, we are not doing anything else after that. The 'close' event is not used at all in our production, I've only added it for debugging purposes to figure out what was going on with the tcp memory increase.
The flow of the actual code is very similar to the sample code I gave you above.
What options are you using regarding the connection? Connection: keep-alive?
Also, does this also happen on earlier versions of Node or Needle?
I am running the sample code above as-is, without any changes, with node main.js, and I can reproduce the problem. No extra connection options. Did you run it and get different results?
I was also able to reproduce it with needle v2.4.0 and needle v2.5.2. The problem is also reproducible with node.js v10.16.0, from a test run I just did.
I see. I'll take a look at this later today. But you certainly shouldn't have to destroy the stream manually.
I'm thinking it might have to do with this (if you want to fiddle with the code right away).
Thanks! I will have a look at the link you provided as well.
That is meant to handle unexpected errors in the underlying socket. Take a look at this test for an example.
Does removing that line fix the issue, by the way?
Hey @tomas ,
Sorry for the small delay.
So, I ran the test again with DEBUG=needle* node main.js, and the following is the full debug output:
DEBUG=needle* node main.js
needle Making request #1 {
localAddress: undefined,
lookup: undefined,
protocol: 'https:',
host: 'releases.ubuntu.com',
port: 443,
path: '/21.04/ubuntu-21.04-desktop-amd64.iso.torrent',
method: 'get',
headers: {
accept: '*/*',
'user-agent': 'Needle/2.7.0 (Node.js v12.16.1; darwin x64)',
host: 'releases.ubuntu.com'
}
} +0ms
needle Got response 200 {
date: 'Thu, 05 Aug 2021 14:27:27 GMT',
server: 'Apache/2.4.29 (Ubuntu)',
'last-modified': 'Thu, 22 Apr 2021 14:43:25 GMT',
etag: '"34978-5c090b3a9db31"',
'accept-ranges': 'bytes',
'content-length': '215416',
connection: 'close',
'content-type': 'application/x-bittorrent'
} +523ms
MAIN.JS END event fired <==== This is console.log from main.js
needle on_socket_end returned: 1, this.writeable: undefined, this.destroyed true +515ms <== This is an extra debug msg I added
So the last two lines of the log above indicate that the socket END event fired normally and that the code block from here did NOT execute, which, from what I understand, is the correct behavior unless something really unexpected has happened.
However, I did force that block to execute (which means that this.destroy was called) and the CLOSE event still does not fire. Full debug run:
DEBUG=needle* node main.js
needle Making request #1 {
localAddress: undefined,
lookup: undefined,
protocol: 'https:',
host: 'releases.ubuntu.com',
port: 443,
path: '/21.04/ubuntu-21.04-desktop-amd64.iso.torrent',
method: 'get',
headers: {
accept: '*/*',
'user-agent': 'Needle/2.7.0 (Node.js v12.16.1; darwin x64)',
host: 'releases.ubuntu.com'
}
} +0ms
needle Got response 200 {
date: 'Thu, 05 Aug 2021 14:34:31 GMT',
server: 'Apache/2.4.29 (Ubuntu)',
'last-modified': 'Thu, 22 Apr 2021 14:43:25 GMT',
etag: '"34978-5c090b3a9db31"',
'accept-ranges': 'bytes',
'content-length': '215416',
connection: 'close',
'content-type': 'application/x-bittorrent'
} +512ms
MAIN.JS END event fired <==== This is console.log from main.js
needle Request error Error: Remote end closed socket abruptly.
at TLSSocket.on_socket_end (./needle/lib/needle.js:500:17)
at processTicksAndRejections (internal/process/task_queues.js:79:11) +498ms
needle Already finished, stopping here. +0ms
Does removing that line fix the issue, by the way?
Nope, I removed the whole block, and it does not seem to make any difference.
// if (!socket.on_socket_end) {
// socket.on_socket_end = on_socket_end;
// socket.once('end', function() { process.nextTick(on_socket_end.bind(socket)) });
// }
I also added a socket.once('close') handler above the line here, and the socket seems to close.
Hm, I see that you are using a PassThrough stream underneath, which is a transform stream, and those do not emit a close event.
Ahh yes, indeed. But why would that trigger the memory increase you're seeing then?
Oh man...goodbye sleep...I will investigate some more...
I think I found the memory leak but I want to make 100% sure so I will update you in the morning. I need to re-run everything with a clear head!
Sure thing!
Hey @tomas,
Sorry for the delay but I was trying to get to the bottom of this.
So, the memory leak is real. If the readable stream, for some reason, gets destroyed before properly ending, the following error occurs:
Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close
at ServerResponse.onclose (internal/streams/end-of-stream.js:80:15)
at ServerResponse.emit (events.js:323:22)
at ServerResponse.EventEmitter.emit (domain.js:482:12)
at Socket.onServerResponseClose (_http_server.js:209:44)
at Socket.emit (events.js:323:22)
at Socket.EventEmitter.emit (domain.js:482:12)
at TCP.<anonymous> (net.js:668:12)
However, the above will not trigger an 'error' event (I haven't figured out why), which means that all the streams that were piped will be left hanging.
One solution is to use stream.pipeline instead of .pipe, which handles the above error automatically and performs proper cleanup of the whole pipeline. With the changes I made in this pull request, the TCP memory goes back to 0 even when an ERR_STREAM_PREMATURE_CLOSE occurs.
I did try to see if I could make it work without using stream.pipeline, by using the IncomingMessage 'aborted' and 'error' events on the resp object, like so:
var request = protocol.request(request_opts, function(resp) {
  resp.on('aborted', function() {
    // Did not fire on ERR_STREAM_PREMATURE_CLOSE
  });
  resp.on('error', function() {
    // Did not fire on ERR_STREAM_PREMATURE_CLOSE
  });
  resp.on('close', function() {
    // Did fire, so perhaps some error handling can be done here by checking this.destroyed?
  });
  // ...rest of the code
});
Based on the documentation here the above events should have fired:
In the case of a premature connection close after the response is received, the following events will be emitted in the following order:
'socket'
'response'
'data' any number of times, on the res object
(connection closed here)
'aborted' on the res object
'error' on the res object with an error with message 'Error: aborted' and code 'ECONNRESET'.
'close'
'close' on the res object
I have also searched to see if anyone else has seen a similar issue and found this and this. Unfortunately, as I said above, the event handlers did not fire, so I might be missing something within the needle code that affects the listeners, or they simply do not fire because they do not need to.
Feel free to do your own tests as well and to check my pull request to see the exact changes I am referring to. Let me know how you want to handle this so we can try and fix it!
Wow, you really DID get to the bottom of it! Kudos!
I'm a bit busy now but will take a look later.
Cheers! In the meanwhile, I will try to figure out why the hell the error event does not trigger on premature close. Probably some crazy node.js streams thing.
Issue #280 from 2019 is basically the same problem, but reproduced in a different way. If a stream in the pipeline gets destroyed before ending properly, a memory leak happens since the file descriptor stays open. You can also reproduce the ERR_STREAM_PREMATURE_CLOSE error if you use the following sample code:
const needle = require('needle');
const {pipeline} = require('stream');
const {createWriteStream} = require('fs');
var url = 'SOME BIG ASS REMOTE FILE URL';
const readStream = needle.get(url);
const writeStream = createWriteStream(`./out.dat`);
pipeline(readStream, writeStream, (error) => {
  if (error) {
    console.log(error);
    // By uncommenting the next line, the underlying request stream will be destroyed properly.
    // Needs to be handled internally.
    // readStream.request.destroy();
  }
});
setTimeout(function () {
  console.log("destroying write stream...");
  writeStream.destroy();
}, 2000);
setInterval(function () {
  // Keep alive
}, 3000);
From what I understand, if you do not wish to use stream.pipeline internally - although I STRONGLY recommend that you do - you need to attach event handlers on each stream in the pipe chain, since .pipe is not built to forward errors, and gracefully destroy the whole chain yourself.
I have also updated the title of this issue because it was misleading.
Since this is a serious issue and has been affecting us severely, I have pushed my branch to production after further testing yesterday, and I can confirm that TCP memory usage seems to be very low now, as expected.
Looking forward to the officially fixed version ;)
Hey @alolis, I'm running the "How to reproduce" code you included in the first post, but with the merged PR the close event still doesn't fire unless calling readStream.destroy(). I'm just wondering if you had tested that, or if the .pipeline change was simply supposed to handle the memleak and not the close event not being fired.
Hey @tomas! That's normal, since I did not change the PassThrough stream you were using underneath, and those stream types do not fire a close event, as per the documentation!
I see. I have another branch where I make sure all pipe destinations also emit the done event, so we can have a consistent API regardless of what the final stream is (Needle's PassThrough or an fs writeStream).
If you are asking how to do this with stream.pipeline, then you can do it in the pipeline callback:
stream.pipeline(....., (err) => {
  if (err) {
    debug(err);
  }
  out.emit('done', err || null);
});
Is this what you want?
Nope, not really. Thanks anyway :)