Check read reference counters in files before closing
simerplaha opened this issue · comments
Current behaviour
Currently, compaction is allowed to close files without checking whether they are being read by another thread. This requires reads to set checkpoints in case the file is closed while a read is in progress.
New behaviour
The need for checkpoints can be removed completely if compaction checks read reference counters in files (#317) and delays closing files while a read is in progress.
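The idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not SwayDB's actual file implementation: `RefCountedFile`, `withRead`, `tryClose` and `isClosed` are hypothetical names, and a single `AtomicInteger` is assumed to hold both the reader count and the closed flag.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.annotation.tailrec

// Hypothetical wrapper used for illustration; not SwayDB's real file type.
final class RefCountedFile(val path: String) {
  // >= 0: number of reads currently in progress; -1: the file has been closed.
  private val state = new AtomicInteger(0)

  /** Runs `read` only if the file is still open. Returns None if it was already closed. */
  def withRead[A](read: => A): Option[A] =
    if (acquire()) {
      try Some(read)
      finally { state.decrementAndGet(); () } // release the read reference
    } else {
      None
    }

  // Increments the reader count unless the file is already closed.
  @tailrec
  private def acquire(): Boolean = {
    val current = state.get()
    if (current < 0) false
    else if (state.compareAndSet(current, current + 1)) true
    else acquire() // lost a race with another thread, retry
  }

  /** Called by compaction. Succeeds only when no read is in progress. */
  def tryClose(): Boolean =
    state.compareAndSet(0, -1) // a real implementation would also release the file handle

  def isClosed: Boolean = state.get() < 0
}
```

With this shape, a `tryClose()` issued while a `withRead` block is running returns `false`, so compaction can retry later, and a read that starts after a successful close gets `None` instead of failing midway. Because a read either completes against an open file or never starts, no checkpoint is needed.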
Benefits
- This would improve performance and also remove the need for `Bag[_]` (`Future`) in the API, therefore making the APIs simpler.
- This would also guarantee that all read APIs would be non-blocking without using a `Future`.
@simerplaha if you could point me to some areas maybe I could contribute a bit - this is super annoying in the project I'm working on
Yea I bet it's annoying.
For this issue we need a lightweight solution around the read APIs in all file types that tells the sweepers to close or delete files only if they are not currently being read.
Currently sweepers close and delete files after a configured deadline, which requires read APIs (`get`, `stream`, etc.) to set checkpoints so they can continue from the last checkpoint if a read fails. This is a complex solution and should be replaced or removed.
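On the sweeper side, the replacement could look something like the sketch below. Everything here is assumed for illustration: `SweepableFile`, `Sweeper`, `schedule` and `sweep` are hypothetical names, the deadline check is left out, and the sweeper is assumed to run on a single thread.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a file that tracks reads in progress.
final class SweepableFile(val name: String) {
  // >= 0: reads in progress; -1: closed. Readers would increment and decrement this.
  val activeReads = new AtomicInteger(0)

  // Closes only when no read is in progress.
  def tryClose(): Boolean = activeReads.compareAndSet(0, -1)
}

// Hypothetical sweeper; assumed to be driven from a single thread.
final class Sweeper {
  private var pending = Vector.empty[SweepableFile]

  // Registers a file whose deadline has passed (deadline handling omitted).
  def schedule(file: SweepableFile): Unit =
    pending = pending :+ file

  /** One pass: closes idle files and keeps busy ones for the next pass. */
  def sweep(): Vector[String] = {
    val (closed, busy) = pending.partition(_.tryClose())
    pending = busy // files still being read are retried on the next sweep
    closed.map(_.name)
  }
}
```

A file that is being read when its deadline passes simply stays in `pending` until a later `sweep()` finds its counter at zero, so the read path never has to recover from a file disappearing underneath it.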
I doubt there would be any need to make changes to compaction itself, but if there is, that code is here.
This is a sizeable task to implement. I'm not sure it's going to be a quick one for anyone just starting off.
I guess you'd need to work from the tag v0.16.2 for the last release which was 2 years ago.