korlibs-archive / korio

Korio: Kotlin cORoutines I/O : Virtual File System + Async/Sync Streams + Async TCP Client/Server + WebSockets for Multiplatform Kotlin 1.3

Home Page: https://korlibs.soywiz.com/korio/

Hashing a file (MD5) without having to read the whole file in memory

saket opened this issue

Hello, what would be an efficient way of comparing file content that does not involve loading the entire content into memory?

fun equalsContent(content: String): Boolean {
  TODO()
}

If the content you want to compare against is a String, I think the best approach is to load the file into memory first, since the String is already consuming that memory. You might also want to specify a charset explicitly, in case the file is not UTF-8.

Gotcha. Let me ask a different question: is there a fast way to calculate the MD5 hash of a file using korio? Something along the lines of okio: https://stackoverflow.com/a/61217039/2511884

Initially there was an MD5 implementation in KorIO, but I moved it to krypto here: https://github.com/korlibs/krypto/blob/master/krypto/src/commonMain/kotlin/com/soywiz/krypto/MD5.kt

Since krypto is not aware of KorIO, it has no concept of a stream, so you would need an extension method to connect the two.

Without checking it against the actual code, it would look something like this:

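// Hash a whole file by streaming it; use { } closes the stream afterwards.
suspend fun VfsFile.hash(algo: HasherFactory) = openRead().use { hash(algo) }

// Hash a stream in 4 KiB chunks so the full content is never held in memory.
suspend fun AsyncStream.hash(algo: HasherFactory): Hash {
    return algo.create().also {
        val temp = ByteArray(0x1000)
        while (true) {
            val count = read(temp)
            if (count <= 0) break // end of stream
            it.update(temp, 0, count) // feed only the bytes actually read
        }
    }.digest()
}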

If there were an artifact that included both korio and krypto, it would be possible to ship those methods with it.
Maybe we can include krypto in korio 2.0 and add those methods. Since all targets have some kind of dead code elimination (DCE), I think it shouldn't be a problem.

Really nice. I look forward to the next release for using this, thank you!

Sure. Keep in mind that you can already use it by including krypto and adding the provided extension methods to your project.
2.0.0 is scheduled for Kotlin 1.4.20.
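
For anyone who wants to try the read-and-update loop from the snippet above without pulling in korio or krypto, here is a minimal JVM-only sketch of the same idea. It is not the korio API: it assumes java.security.MessageDigest in place of krypto's hasher and a blocking java.io.InputStream in place of AsyncStream, and the names md5Chunked and toHex are made up for illustration.

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream
import java.security.MessageDigest

// Hypothetical helper (not part of korio or krypto): hashes a stream in
// fixed-size chunks so the whole content never sits in memory at once.
// bufferSize must be greater than zero.
fun md5Chunked(input: InputStream, bufferSize: Int = 0x1000): ByteArray {
    val digest = MessageDigest.getInstance("MD5")
    val temp = ByteArray(bufferSize)
    while (true) {
        val count = input.read(temp)
        if (count <= 0) break          // -1 signals end of stream
        digest.update(temp, 0, count)  // feed only the bytes actually read
    }
    return digest.digest()
}

// Hypothetical helper: lowercase hex rendering of a digest.
fun ByteArray.toHex(): String = joinToString("") { "%02x".format(it) }

fun main() {
    val data = "hello world".toByteArray(Charsets.UTF_8)
    // A tiny buffer forces several read/update rounds.
    val chunked = md5Chunked(ByteArrayInputStream(data), bufferSize = 4)
    val oneShot = MessageDigest.getInstance("MD5").digest(data)
    println(chunked.toHex())
    println(chunked.contentEquals(oneShot))
}
```

The chunked digest should match the one-shot digest regardless of buffer size, which is the property the suspend version in the thread relies on as well.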