`ZipVfs` out of heap space.
Kesanov opened this issue
I tried to unzip a 1GB ZIP file with `ZipVfs` and the following code:
val data = ZipVfs(dir["archive.zip"].open())
for (name in data.listNames()) {
data[name].copyTo(dir["archive::$name"])
}
However, this crashed with an out-of-memory error:
java.lang.OutOfMemoryError: Java heap space
at com.soywiz.kmem.ByteArrayBuilder.<init>(ByteArrayBuilder.kt:10)
at com.soywiz.korio.compression.CompressionMethodKt.uncompress(CompressionMethod.kt:56)
at com.soywiz.korio.compression.CompressionMethodKt.uncompress$default(CompressionMethod.kt:49)
at com.soywiz.korio.file.std.ZipVfsKt$ZipVfs$Impl.open(ZipVfs.kt:64)
at com.soywiz.korio.file.Vfs.openInputStream$suspendImpl(Vfs.kt:65)
at com.soywiz.korio.file.Vfs.openInputStream(Vfs.kt)
at com.soywiz.korio.file.VfsFile.openInputStream(VfsFile.kt:69)
at com.soywiz.korio.file.VfsFile.copyTo(VfsFile.kt:57)
at Mod$download$2.invokeSuspend(main.kt:188)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
at com.soywiz.kmem.ByteArrayBuilder.<init>(ByteArrayBuilder.kt:10)
at com.soywiz.korio.compression.CompressionMethodKt.uncompress(CompressionMethod.kt:56)
at com.soywiz.korio.compression.CompressionMethodKt.uncompress$default(CompressionMethod.kt:49)
at com.soywiz.korio.file.std.ZipVfsKt$ZipVfs$Impl.open(ZipVfs.kt:64)
at com.soywiz.korio.file.Vfs.openInputStream$suspendImpl(Vfs.kt:65)
at com.soywiz.korio.file.Vfs.openInputStream(Vfs.kt)
at com.soywiz.korio.file.VfsFile.openInputStream(VfsFile.kt:69)
at com.soywiz.korio.file.VfsFile.copyTo(VfsFile.kt:57)
at Mod$download$2.invokeSuspend(main.kt:188)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
The issue seems to be that `ZipVfs.kt:64` attempts to read the whole content into a `ByteArray`:
val compressed = compressedData.readAll().uncompress(method)
Any workaround before this gets fixed?
As a workaround, I guess you can copy that file into your project and modify it. Maybe you can remove the CRC check and replace `compressedData.readAll().uncompress(method)` with `compressedData.uncompressed(method)`? (unchecked)
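For JVM-only use, the general idea behind this workaround (decompress through a fixed-size buffer instead of materializing each entry as one `ByteArray`) can be sketched with the JDK's `java.util.zip` rather than korio. This is a minimal illustration, not korio's implementation; `extractStreaming` and its parameters are hypothetical names.

```kotlin
import java.io.File
import java.util.zip.ZipFile

// Hypothetical helper: extracts every entry of [archive] into [targetDir].
// Each entry is streamed through a buffer of [bufferSize] bytes, so heap use
// stays bounded regardless of how large an individual entry is.
fun extractStreaming(archive: File, targetDir: File, bufferSize: Int = 64 * 1024): Int {
    var extracted = 0
    val root = targetDir.canonicalFile
    ZipFile(archive).use { zip ->
        for (entry in zip.entries()) {
            val out = File(root, entry.name).canonicalFile
            // Reject entries whose path would escape the target directory.
            require(out.startsWith(root)) { "Entry escapes target dir: ${entry.name}" }
            if (entry.isDirectory) {
                out.mkdirs()
                continue
            }
            out.parentFile?.mkdirs()
            zip.getInputStream(entry).use { input ->
                out.outputStream().use { output ->
                    // copyTo reads and writes in chunks of bufferSize bytes.
                    input.copyTo(output, bufferSize)
                }
            }
            extracted++
        }
    }
    return extracted
}
```

Usage would look like `extractStreaming(File("archive.zip"), File("out"))`. This does not help on Kotlin/Native, where `java.util.zip` is unavailable.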
I have defined my own `ZipVfs` with the CRC check and `readAll` removed. However, this broke unzipping on Windows (MingwX64).
The following code works fine on JVM, but on Windows it never finishes copying the first file (no errors, infinite loop). If I swap my `ZipVfs` file for korio's `ZipVfs`, it works on Windows just fine:
ZipVfs(file).listRecursive{ true }.collect {
it.copyTo(dir[it.path])
}
You can try to apply this patch somehow. I had to update other files too:
korlibs-archive/korge-next@512dcd6
Or use korge-next: download the repo, run `./gradlew publishToMavenLocal`, and use `2.0.0.999` as the version for the artifacts, including `korge-gradle-plugin`.
So, I have tried the new version. However, I noticed an extreme degradation in the performance of `copyTo`.
- On JVM, the old `ZipVfs` `copyTo` copies ~100 MB/s.
- On Windows, the old `ZipVfs` `copyTo` copies ~15 MB/s.
- On Windows, the new `ZipVfs` `copyTo` copies ~150 KB/s!
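Throughput figures like these can be obtained by timing a copy and dividing the byte count by the elapsed time. The sketch below is one way to do that on the JVM with plain `java.io` streams; it is not how the numbers above were measured, and `throughputMBps` and `measureCopyMBps` are hypothetical helper names.

```kotlin
import java.io.InputStream
import java.io.OutputStream

// Converts a byte count and an elapsed time into MB/s (1 MB = 1024 * 1024 bytes).
fun throughputMBps(bytes: Long, seconds: Double): Double {
    require(seconds > 0.0) { "elapsed time must be positive" }
    return bytes / (1024.0 * 1024.0) / seconds
}

// Copies [input] to [output] and returns the observed throughput in MB/s.
fun measureCopyMBps(input: InputStream, output: OutputStream): Double {
    val start = System.nanoTime()
    val bytes = input.copyTo(output)
    // Clamp to at least 1 ns so very small copies do not divide by zero.
    val elapsedSeconds = (System.nanoTime() - start).coerceAtLeast(1L) / 1e9
    return throughputMBps(bytes, elapsedSeconds)
}
```

A single run is noisy, so comparisons between platforms are more meaningful when the same large file is copied several times and the results are averaged.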
Looks like the source of the overhead is the `uncompress` method (both for `DeflateNative` and `DeflatePortable`).
Where `logtime` is a top-level function: `suspend fun logtime(msg: String, fn: suspend () -> Unit) { fn() }`.
I am a little surprised that I still see `suspend` in the call graph, as I ran the code inside `runBlockingNoJs`.
I thought that would remove any suspension points...
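One likely reason `suspend` frames remain visible: a `suspend` function is compiled to take a continuation parameter no matter how its caller is run, so a blocking runner only changes who drives the coroutine, not the shape of the calls. The sketch below shows this with the standard library's `kotlin.coroutines` primitives only. `runNonSuspending` is a hypothetical stand-in written for illustration and is not how `runBlockingNoJs` is implemented.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Same shape as the helper quoted above: it only forwards to fn.
suspend fun logtime(msg: String, fn: suspend () -> Unit) { fn() }

// Hypothetical runner: starts [block] as a coroutine on the current thread.
// It only supports blocks that complete without ever really suspending.
fun <T> runNonSuspending(block: suspend () -> T): T {
    var done = false
    var value: T? = null
    var error: Throwable? = null
    // The block is still invoked through a Continuation, which is why
    // suspend-related frames show up in a profiler even in blocking code.
    block.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        result.onSuccess { value = it }.onFailure { error = it }
        done = true
    })
    check(done) { "block suspended; this sketch cannot resume it" }
    error?.let { throw it }
    @Suppress("UNCHECKED_CAST")
    return value as T
}
```

Calling `runNonSuspending { logtime("copy") { /* work */ } }` runs the work synchronously, yet `logtime` and the lambda are still invoked as suspend functions.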
It might be related to korlibs/korge#591.
@Josef-Vonasek I have made a large performance improvement in korge-next. Can you run your checks again and post the results here? (JVM + native release)
I tried to compile the whole `korge-next` with `./gradlew publishToMavenLocal`. However, it takes more than an hour and I keep getting this compile error:
> Task :korge-editor:compileKotlinMingwX64
e: D:\korge-next\korge-editor\build\platforms\nativeDesktop\entrypoint\main.kt: (4, 8): Unresolved reference: main
e: D:\korge-next\korge-editor\build\platforms\nativeDesktop\entrypoint\main.kt: (8, 14): No value passed for parameter 'args'
Is there a way to compile, publish and import only the updated `korio`?
All the other modules have probably been published already, even though korge-editor failed. You can also disable it in `settings.gradle.kts`.
Tested. It is much faster now: 10-15 MB/s. Still about 10x slower than JVM, but fast enough for my use case. Thank you for the fix.
Okay. Then let's keep this as resolved. Hopefully we will figure out the issue on native. It uses zlib by default, so it should be as fast as on the JVM; the slowdown might come from overhead somewhere else, such as coroutines or I/O.
Feel free to create a separate issue for that. This issue, at least, is fixed (running out of heap space).