rust-secure-code / safety-dance

Auditing crates for unsafe code which can be safely replaced

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Audit seahash

chills42 opened this issue · comments

Seahash is an implementation of a fast non-cryptographic hashing algorithm, comparable to xxHash or MetroHash.

https://gitlab.redox-os.org/redox-os/seahash

From a cursory look, it seems to have 5 unsafe blocks:

  1. unaligned reads to turn a &[u8] into a u64
    This looks fine to me
    Maybe the big unsafe block could be turned into smaller blocks inside each match arm, and maybe the byte-sized reads can be turned into slice indexing operations instead of raw pointer reads (Since rustc might be able to elide the bounds checks, but I'm not sure)
    It correctly uses as_ptr to get a pointer to the entire slice instead of a pointer to the first element, and it returns 0 for anything that does not satisfy the buf.len() < 8 assumption without triggering UB. It also correctly uses unaligned reads
  2. Read a u64 from a *const u8
    This also looks sound to me. It's an unsafe fn, although pherhaps the requirement that the pointer points to 8 bytes of data could be specified more explicitly in the doc comment for it
    It correctly uses unaligned reads
  3. A test case for (2)
    Not much to say about this, looks fine
  4. A 121 line unsafe block in State::hash
    This has two parts:
    First it iterates through the buffer using end and current raw pointers and reading 4 u64's using the function from (2). It obtains the 32-byte ending chunk pointer by doing buf.as_ptr().offset(buf.len() as isize & !0x1F) which looks fine to me but I would like someone else to double check. The u64 reads seem fine since (2) uses unaligned reads
    After that it handles the remainder
    It calculates the remaining bytes by doing let mut excessive = buf.len() as usize + buf.as_ptr() as usize - end_ptr as usize;. I'm assuming this is fine but I think it would be more clear if it has parenthesis around the addition to make it a bit more clear that this never underflows
    Then it just hashes the remaining bytes using the functions from (1) and (2). As far as I can tell, all the match arms are sound.
  5. A 97 line unsafe block in SeaHasher::push_bytes
    I did not look into this. At a glance it seems to mostly consist of a copy_to_nonoverlapping call and then something similar to what's done in (4)

It does not have any dependencies, so there is nothing to audit with regards to deps

I also ran the tests under miri and they all passed

In summary, as far as I can tell the crate is sound (Outside of (5) which I did not look into), but can probably be improved a bit with comments and smaller unsafe blocks. Another thing we might wanna do is check if we can replace unsafe code with safe alternatives without performance regressions. The crate has criterion benchmarks which should make that easier to attempt

I'm pretty sure point 1 can be refactored using u32::from_ne_bytes(buf[..4].try_into().unwrap()) instead of unaligned reads. Since the length is known in advance, the optimizer should elide bounds checks and the unreachable panic in .unwrap().