`closeable-seq` may end prematurely after GC?
jerrypnz opened this issue · comments
We have some code that reads an InputStream
into chunks of fixed size ByteBuffer
s using (convert s (seq-of ByteBuffer))
. When testing the code, it's found that the returned seq doesn't always have all the data when the input stream is relatively big.
We were able to reproduce it using the following code snippet:
user> (count (convert (io/input-stream (io/file "bigfile")) (seq-of ByteBuffer)))
;; => 26955
user> (count (convert (io/input-stream (io/file "bigfile")) (seq-of ByteBuffer)))
;; => 79920
user> (count (convert (io/input-stream (io/file "bigfile")) (seq-of ByteBuffer)))
;; => 86093
user> (count (convert (io/input-stream (io/file "bigfile")) (seq-of ByteBuffer)))
;; => 27063
As the above snippet shows, the count
of the seq returned by convert
keeps changing. We suspect it might be cause by the finalize
in closeable-seq
. We think what's happening is that the head of the seq got GCed when it's halfway through consuming it, which triggered a finalize
call that caused the source stream to be closed prematurely.
I'm wondering if closeable-seq
should stop overriding finalize
and leave it to the user to close the source, like line-seq
does.
That's a serious oversight on my part, thanks for catching this. I think the only two potential solutions are to create a reference counting scheme such that it's only closed once all instances have been finalized, or to just not do any of the finalization logic. The latter is kind of a breaking change, but I'm guessing the "close on exhaustion" logic is what most (all?) people are relying on.
Not sure how other people are using it but I'd always have something like with-open
at a higher level that closes the source. Maybe introduce a :close?
option to convert
which decides if it automatically closes the source?
Can confirm closeable-seq
's finalize is definitely the problem, because disabling it fixes the error.
While I like the idea of leaving it up to users to close resources, that still leaves conversions like [File (seq-of ByteBuffer)]
, which will fail to close their own resources.