typst / comemo

Incremental computation through constrained memoization.

Why use siphasher?

FilipAndersson245 opened this issue

Hello,
I was just wondering why specifically this library uses SipHash as the default hashing algorithm?
As we hash u128, I think aHash would be a good alternative, as it seems to be upwards of 20x faster for those sizes.

We need collision resistance because we assume that hash collisions never happen. I don't know about aHash, but FnvHash and FxHash are not designed for such use cases. The decision to use SipHash follows rustc's incremental compiler: rust-lang/rust#107925

It seems aHash would not be suitable then: it uses a u64 hash, but its hash quality seems to be better than FnvHash's.

> We need collision resistance because we assume that hash collisions never happen. I don't know about aHash, but FnvHash and FxHash are not designed for such use cases.

SipHash128 is not a collision-resistant hash function. If you read the original SipHash whitepaper, the authors even say that it is "obvious" that it is not.

> The decision to use SipHash follows rustc's incremental compiler: rust-lang/rust#107925

See rust-lang/rust#10389 for extensive discussion.

From your source code:

comemo/src/prehashed.rs

Lines 22 to 27 in 1275982

/// Because comemo uses high-quality 128 bit hashes in all places, the risk of a
/// hash collision is reduced to an absolute minimum. Therefore, this type
/// additionally provides `PartialEq` and `Eq` implementations that compare by
/// hash instead of by value. For this to be correct, your hash implementation
/// **must feed all information relevant to the `PartialEq` impl to the
/// hasher.**
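
To make the contract in that doc comment concrete, here is a minimal sketch of a wrapper that compares by hash. It is not comemo's actual `Prehashed` implementation: the names `HashEq` and `Leaky` are hypothetical, and the standard library's 64-bit `DefaultHasher` stands in for the 128-bit SipHash that comemo uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a prehashed wrapper; not comemo's real type.
pub struct HashEq<T> {
    hash: u64, // comemo stores a 128-bit hash; 64 bits keeps this sketch std-only
    value: T,
}

impl<T: Hash> HashEq<T> {
    pub fn new(value: T) -> Self {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        Self { hash: hasher.finish(), value }
    }

    pub fn value(&self) -> &T {
        &self.value
    }
}

// Equality looks only at the stored hash, never at `value`.
impl<T> PartialEq for HashEq<T> {
    fn eq(&self, other: &Self) -> bool {
        self.hash == other.hash
    }
}
impl<T> Eq for HashEq<T> {}

/// A type whose `Hash` impl skips a field that `PartialEq` compares.
#[derive(PartialEq)]
pub struct Leaky {
    pub id: u32,
    pub note: &'static str,
}

impl Hash for Leaky {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state); // `note` is never fed to the hasher
    }
}
```

Wrapping two `Leaky` values with the same `id` but different `note` fields yields wrappers that compare equal even though the values differ, which is exactly the failure the doc comment warns about.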

Even when used with a CSPRNG-provided random key, the original use case for SipHash was hash tables, which perform a full equality check after the hash comparison to verify there are no collisions.
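
The pattern of comparing hashes first and then confirming with a full equality check can be sketched as a bucket lookup. This illustrates the general idea only and is not taken from any particular library; `Entry`, `hash_of`, and `find` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical bucket entry that caches the key's hash next to the key.
pub struct Entry<K, V> {
    pub hash: u64,
    pub key: K,
    pub value: V,
}

pub fn hash_of<K: Hash>(key: &K) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Looks up `key` in a bucket. The cheap hash comparison filters candidates;
/// the full key comparison afterwards makes a hash collision harmless.
pub fn find<'a, K: Hash + Eq, V>(bucket: &'a [Entry<K, V>], key: &K) -> Option<&'a V> {
    let hash = hash_of(key);
    bucket
        .iter()
        .find(|e| e.hash == hash && e.key == *key)
        .map(|e| &e.value)
}
```

If two different keys end up with the same stored hash, the `e.key == *key` test still picks the right entry. A type that compares by hash alone has no such fallback.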

The idea here isn't that an attacker can't find a collision, but rather that we don't run into a collision accidentally. Reading this comment rust-lang/rust#10389 (comment), it appears that 128-bit SipHash gives us that.
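
For a rough sense of scale, the chance of an accidental collision can be estimated with the birthday bound. The sketch below assumes an ideal, uniformly distributed hash and non-adversarial inputs, which is an idealization rather than a proven property of SipHash; the function name is hypothetical.

```rust
/// Approximate probability that at least two of `n` distinct inputs collide
/// under an ideal, uniformly distributed `bits`-bit hash (birthday bound):
/// p ≈ 1 - exp(-n² / 2^(bits + 1)).
pub fn accidental_collision_probability(n: f64, bits: i32) -> f64 {
    let x = n * n / 2f64.powi(bits + 1);
    // `exp_m1` keeps precision when `x` is tiny, where `1.0 - (-x).exp()`
    // would round to zero.
    -(-x).exp_m1()
}
```

With 2^32 (about four billion) hashed values, this gives roughly a 39% chance of some collision for a 64-bit hash and about 2.7e-20 for a 128-bit hash.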

I admit that the docs don't make that fully clear.