korlibs-archive / korio

Korio: Kotlin cORoutines I/O : Virtual File System + Async/Sync Streams + Async TCP Client/Server + WebSockets for Multiplatform Kotlin 1.3

Home Page:https://korlibs.soywiz.com/korio/

`ZipVfs` out of heap space.

Kesanov opened this issue

commented

I tried to unzip a 1GB ZIP file with ZipVfs and the following code:

val data = ZipVfs(dir["archive.zip"].open())
for (name in data.listNames()) {
    data[name].copyTo(dir["archive::$name"])
}

However, this crashed with an out-of-memory error:

java.lang.OutOfMemoryError: Java heap space
	at com.soywiz.kmem.ByteArrayBuilder.<init>(ByteArrayBuilder.kt:10)
	at com.soywiz.korio.compression.CompressionMethodKt.uncompress(CompressionMethod.kt:56)
	at com.soywiz.korio.compression.CompressionMethodKt.uncompress$default(CompressionMethod.kt:49)
	at com.soywiz.korio.file.std.ZipVfsKt$ZipVfs$Impl.open(ZipVfs.kt:64)
	at com.soywiz.korio.file.Vfs.openInputStream$suspendImpl(Vfs.kt:65)
	at com.soywiz.korio.file.Vfs.openInputStream(Vfs.kt)
	at com.soywiz.korio.file.VfsFile.openInputStream(VfsFile.kt:69)
	at com.soywiz.korio.file.VfsFile.copyTo(VfsFile.kt:57)
	at Mod$download$2.invokeSuspend(main.kt:188)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
	at com.soywiz.kmem.ByteArrayBuilder.<init>(ByteArrayBuilder.kt:10)
	at com.soywiz.korio.compression.CompressionMethodKt.uncompress(CompressionMethod.kt:56)
	at com.soywiz.korio.compression.CompressionMethodKt.uncompress$default(CompressionMethod.kt:49)
	at com.soywiz.korio.file.std.ZipVfsKt$ZipVfs$Impl.open(ZipVfs.kt:64)
	at com.soywiz.korio.file.Vfs.openInputStream$suspendImpl(Vfs.kt:65)
	at com.soywiz.korio.file.Vfs.openInputStream(Vfs.kt)
	at com.soywiz.korio.file.VfsFile.openInputStream(VfsFile.kt:69)
	at com.soywiz.korio.file.VfsFile.copyTo(VfsFile.kt:57)
	at Mod$download$2.invokeSuspend(main.kt:188)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)

commented

The issue seems to be that ZipVfs.kt:64 reads the whole content into a ByteArray before uncompressing it:

val compressed = compressedData.readAll().uncompress(method)

Any workaround before this gets fixed?

As a workaround, I guess you can copy that file into your project and modify it. Maybe you can remove the CRC check and replace compressedData.readAll().uncompress(method) with compressedData.uncompressed(method)? (untested)
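The streaming idea behind this workaround can be sketched on the JVM with the standard java.util.zip classes. This is not Korio's API and not the actual patch: inflateStreaming, InflateResult and deflate are hypothetical names, and the demo uses zlib-wrapped data for simplicity, whereas ZIP entries store raw DEFLATE. It only illustrates that decompressing chunk by chunk keeps memory bounded by the buffer size, and that the CRC-32 can be updated incrementally instead of being dropped.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.io.OutputStream
import java.util.zip.CRC32
import java.util.zip.DeflaterOutputStream
import java.util.zip.InflaterInputStream

/** Result of a streamed decompression: bytes written and CRC-32 of the output. */
data class InflateResult(val bytes: Long, val crc32: Long)

/**
 * Decompresses a zlib-wrapped stream chunk by chunk into [out].
 * Peak memory is bounded by [bufferSize], not by the size of the entry,
 * and the CRC-32 is updated per chunk so the check does not need the whole array.
 * (Hypothetical helper, not part of Korio.)
 */
fun inflateStreaming(
    compressed: InputStream,
    out: OutputStream,
    bufferSize: Int = 64 * 1024
): InflateResult {
    val crc = CRC32()
    val buffer = ByteArray(bufferSize)
    var total = 0L
    InflaterInputStream(compressed).use { input ->
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break              // end of the compressed stream
            out.write(buffer, 0, read)       // forward only the bytes just inflated
            crc.update(buffer, 0, read)      // incremental checksum
            total += read
        }
    }
    return InflateResult(total, crc.value)
}

/** Demo-only helper: compresses [data] in the zlib format. */
fun deflate(data: ByteArray): ByteArray {
    val sink = ByteArrayOutputStream()
    DeflaterOutputStream(sink).use { it.write(data) }
    return sink.toByteArray()
}
```

In a real ZIP reader the expected CRC-32 would come from the entry header and be compared with the returned value once the stream has been fully consumed.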

commented

I have defined my own ZipVfs with the CRC check and readAll removed. However, this broke unzipping on Windows (MingwX64).

The following code works fine on the JVM, but never finishes copying the first file on Windows (no errors, just an infinite loop).
If I swap my ZipVfs file for korio's ZipVfs, it works on Windows just fine:

ZipVfs(file).listRecursive{ true }.collect {
    it.copyTo(dir[it.path])
}

You can try to apply this patch somehow. I had to update other files too:

korlibs-archive/korge-next@512dcd6

Or use korge-next: download the repo, run ./gradlew publishToMavenLocal, and use 2.0.0.999 as the version for the artifacts, including korge-gradle-plugin.

commented

So, I have tried the new version. However, I noticed an extreme performance degradation in copyTo.

  1. With JVM, old ZipVfs copyTo copies ~100 MB/s.
  2. On Windows, old ZipVfs copyTo copies ~15 MB/s.
  3. On Windows, new ZipVfs copyTo copies ~150 KB/s!
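For reference, figures like these can be obtained by timing a chunked copy and dividing the bytes moved by the elapsed time. The sketch below is JVM-only and uses plain java.io streams rather than Korio's VfsFile.copyTo; throughputMBps and timedCopy are hypothetical helpers, and 1 MB is taken as 1,000,000 bytes.

```kotlin
import java.io.InputStream
import java.io.OutputStream

/** Converts a byte count and elapsed nanoseconds to MB/s (1 MB = 1,000,000 bytes). */
fun throughputMBps(bytes: Long, elapsedNanos: Long): Double {
    require(elapsedNanos > 0) { "elapsed time must be positive" }
    return (bytes / 1_000_000.0) / (elapsedNanos / 1_000_000_000.0)
}

/** Copies [input] to [output] in chunks and returns the measured throughput in MB/s. */
fun timedCopy(input: InputStream, output: OutputStream, bufferSize: Int = 64 * 1024): Double {
    val buffer = ByteArray(bufferSize)
    var total = 0L
    val start = System.nanoTime()
    while (true) {
        val read = input.read(buffer)
        if (read < 0) break                  // end of input
        output.write(buffer, 0, read)
        total += read
    }
    // Guard against a zero reading on very small inputs.
    val elapsed = (System.nanoTime() - start).coerceAtLeast(1)
    return throughputMBps(total, elapsed)
}
```

Expressed in the same unit, ~15 MB/s against ~150 KB/s (0.15 MB/s) is roughly a 100-fold slowdown between the old and new ZipVfs on Windows.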
commented

Looks like the source of overhead is the uncompress method (both for DeflateNative and DeflatePortable).

commented

Output of the profiler:
[two profiler screenshots]

Here, logtime is a top-level function: suspend fun logtime(msg: String, fn: suspend () -> Unit) { fn() }.

I am a little surprised that I still see suspend in the callgraph, as I ran the code inside runBlockingNoJs.
I thought that would remove any suspension points...
It might be related to korlibs/korge#591.

@Josef-Vonasek I have made a huge performance improvement in korge-next. Can you rerun your checks and post the results here? (JVM + native release)

commented

I tried to compile the whole of korge-next with ./gradlew publishToMavenLocal. However, it takes more than an hour and I keep getting this compile error:

> Task :korge-editor:compileKotlinMingwX64
e: D:\korge-next\korge-editor\build\platforms\nativeDesktop\entrypoint\main.kt: (4, 8): Unresolved reference: main
e: D:\korge-next\korge-editor\build\platforms\nativeDesktop\entrypoint\main.kt: (8, 14): No value passed for parameter 'args'

Is there a way to compile, publish and import only the updated korio?

Probably all the other modules have already been published even though korge-editor failed. You can also disable that module in settings.gradle.kts.

commented

Tested. It is much faster now: 10-15 MB/s. Still 10x slower than JVM, but fast enough for my use case. Thank you for the fix.

Okay. Then let's consider this resolved. Hopefully we will figure out the native performance issue. Native uses zlib by default, so it should be as fast as on the JVM; the slowdown might come from overhead somewhere else, such as coroutines or I/O.
Feel free to create a separate issue for that. This issue, at least, is fixed (the out-of-heap-space error).