Disk space not released
funky-eyes opened this issue · comments
What can we help you with?
I found that after running for a while, disk space is not actually released. I am using KRaft cluster mode and I don't know how to troubleshoot it.
Why is this happening? I found that after I restarted the node, the disk space was freed up.
Where would you expect to find this information?
I can clearly see this file in S3, but locally it still hasn't been cleaned up!
Hi @funky-eyes
By "not cleaned up" you mean they exist as "<old_file_name>.deleted"?
They no longer exist in the directory listing. Can you see the picture I sent? The deleted file is still being referenced, so the disk space is not released.
Could this be due to the operating system? I've noticed that the disk space is eventually freed, but only after minutes or even tens of minutes!
I created another topic with remote.storage.enable=false and retention.ms=180000, and when segments are cleaned up on disk, the space is freed almost in real time.
I also reported this issue to the Kafka community: https://issues.apache.org/jira/browse/KAFKA-16378
We're looking into this, trying to first understand if it's the plugin's or broker's problem.
Thank you very much for looking into this. The cluster I deployed is in KRaft mode, built and deployed from the latest main branch. I found that as soon as I run jcmd <pid> GC.run, the occupied disk space is released immediately; but when no GC happens, some files are not released, and there is no error-level output in the logs.
And I'm using the S3 tiered storage implementation.
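The GC.run observation fits how deleted-but-open files behave on Linux: unlinking a file only removes its directory entry, and the blocks stay allocated until every open descriptor is closed. A GC run can trigger cleaners/finalizers that close forgotten streams, which is why the space appears only then. A minimal sketch of the effect (standalone demo, not plugin code):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeletedButOpen {
    public static void main(String[] args) throws IOException {
        // Create a small file standing in for a log segment.
        Path p = Files.createTempFile("segment", ".log");
        Files.write(p, new byte[1024]);

        FileInputStream in = new FileInputStream(p.toFile());
        Files.delete(p);            // directory entry is gone...
        int first = in.read();      // ...but the open stream still reads the data,
                                    // and the blocks remain allocated on disk
        System.out.println("readable after delete: " + (first != -1));

        in.close();                 // only now can the OS reclaim the blocks
    }
}
```

On Linux, `lsof` shows such files with a `(deleted)` suffix while a process still holds them open.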
Seems to be really an issue in the plugin. Will be fixed in #516
Thanks, I'll pull it later and recompile locally for testing.
I understand that the purpose of this PR is to introduce a ClosableInputStreamHolder, which uniformly handles the closing of all InputStreams generated during the copyLogSegmentData phase, ensuring that the streams are correctly closed. Is my understanding correct?
Yeah, that's correct. We forgot to close those streams and, through them, the underlying files stayed open. They lingered until Java's internal cleanup machinery kicked in and closed them.
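The pattern described above can be sketched as a holder that registers every InputStream opened during the copy and closes them all in one place. The class name comes from the PR discussion; the body below is an illustrative sketch under that assumption, not the plugin's actual code:

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ClosableInputStreamHolderDemo {
    // Sketch: track streams created during copyLogSegmentData so that a
    // single close() guarantees none of them (or their files) leak.
    static class ClosableInputStreamHolder implements Closeable {
        private final List<InputStream> streams = new ArrayList<>();

        InputStream add(InputStream in) {
            streams.add(in);
            return in;
        }

        @Override
        public void close() throws IOException {
            IOException first = null;
            for (InputStream in : streams) {
                try {
                    in.close();
                } catch (IOException e) {
                    if (first == null) first = e;
                    else first.addSuppressed(e);
                }
            }
            if (first != null) throw first;
        }
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes every registered stream even if the upload fails.
        try (ClosableInputStreamHolder holder = new ClosableInputStreamHolder()) {
            InputStream log = holder.add(new ByteArrayInputStream("log".getBytes()));
            InputStream index = holder.add(new ByteArrayInputStream("index".getBytes()));
            System.out.println(log.available() + " " + index.available());
        }
    }
}
```

The key design point is that close() attempts to close every stream and suppresses secondary exceptions, so one failing stream cannot leave the rest (and their file descriptors) open.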
Thank you for your eagerness to help.