Audit rand

Question

Audit rand

TyPR124 opened this issue 5 years ago · comments

Currently the most downloaded crate on crates.io.

Contains quite a few unsafe

Functions  Expressions  Impls  Traits  Methods  Dependency

0/0        37/95        0/0    0/0     0/0      !  rand 0.7.2
0/4        4/80         0/0    0/0     0/1      !  ├── getrandom 0.1.13
0/0        0/0          0/0    0/0     0/0      ?  │   └── cfg-if 0.1.10
0/0        0/0          0/0    0/0     0/0      ?  ├── rand_chacha 0.2.1
0/0        0/0          0/0    0/0     0/0      ?  │   ├── c2-chacha 0.2.3
2/2        469/500      0/0    0/0     14/22    !  │   │   └── ppv-lite86 0.2.6
0/0        22/22        0/0    0/0     0/0      !  │   └── rand_core 0.5.1
0/4        4/80         0/0    0/0     0/1      !  │       └── getrandom 0.1.13
0/0        22/22        0/0    0/0     0/0      !  ├── rand_core 0.5.1
0/0        0/0          0/0    0/0     0/0      ?  └── rand_pcg 0.2.1
0/0        22/22        0/0    0/0     0/0      !      └── rand_core 0.5.1

Lokathor · Answer 1 · Mon Nov 04 2019 14:03:45 GMT+0800 (China Standard Time)

I audited some of it in the past and sent in a few small fixes.

Unfortunately, having them take on extra dependencies is less likely because of how central the crate is, but there is probably still space to improve.

Phlosioneer · Answer 2 · Wed Nov 20 2019 13:41:22 GMT+0800 (China Standard Time)

This task should probably be broken up into the sub-crates. A lot of the unsafe code is mostly to use OS functions while in no_std mode, from what I've gathered. The first step is to figure where nontrivial unsafe code is.

Diggory Hardy · Answer 3 · Wed Apr 01 2020 21:18:44 GMT+0800 (China Standard Time)

Extracted from a reddit post:

[..] inspired me to have a quick look at uses of unsafe in the Rand crate. It would seem that uses can be categorised under:

Type conversions. For example, accessing an [i16] as &mut [u8] is unsafe only in that interpretation of the values is not so well defined (in practice, one must byte-swap on Big or Little Endian to get consistent results). What does this mean in practice? (a) that the type prover cannot constrain the output values [which it couldn't anyway in this case], and (b) that results may be platform dependent. So this is not memory safety, but still unsafe.
The reverse type conversion (e.g. [u8; 8] to u64). This does come with a memory safety issue: alignment, and thus we use ptr::read_unaligned.
ptr::copy_nonoverlapping: in our uses the borrow checker should (in theory) be able to prove that the source and target do not alias, that both regions are valid, and that the target values are valid (since the target is an integer array which does not have invalid values). So it may be viable for the type system to validate this in the future.
Calling core::ptr::NonNull::as_mut on a thread-local object. As far as I understand, the unsafety comes from the inability of the borrow checker to guard against concurrent mutation. We could instead use Rc and rely on the std lib's more complex abstraction over unsafe, but is that an improvement?
Accessing some intrinsics to support SIMD types. This is currently feature-gated and nightly only.
Constructing a char sampled from a fixed range. I guess this comes down to a choice of guaranteeing performance over safety, and may be the wrong choice in this instance (feel free to open a PR).
FFI to access platform functionality not available through std
To avoid a performance penalty associated with bounds checks.

So in my view, unsafe is a big hammer where often a much smaller, more specialised tool could do the job. I have in the past found memory safety issues in unsafe code which had nothing to do with the motivation for using unsafe, but which were nevertheless hidden by use of it. Better tools could go a long way to improving this situation, e.g. things like unsafe_assertion(i < len) or unsafe(concurrent_access).

It looks like this repository is focussed on memory safety, so I'd just like to quickly mention that Rand has a few other safety concerns: that generated keys/values are filled with random data, that RNGs are correctly initialised, that CSPRNG state is not inadvertently leaked, that CSPRNGs correspond to published test vectors, and a few other bits like fork detection.

Alex Gaynor · Answer 4 · Wed Apr 01 2020 21:35:27 GMT+0800 (China Standard Time)

To point number two, could those be replaced with u64::from_ne_bytes()?

Diggory Hardy · Answer 5 · Wed Apr 01 2020 21:37:40 GMT+0800 (China Standard Time)

@alex ~~no, but see this post to avoid redundant discussion.~~

Possibly yes actually, though I think it requires a more recent compiler than our current MSRV of 1.32.

Tyler Ruckinger · Answer 6 · Wed Apr 01 2020 23:04:50 GMT+0800 (China Standard Time)

u64::from_ne_bytes() is available on 1.32 so it should be doable while keeping MSRV 1.32, unless there's something else that would be needed.

Sergey "Shnatsel" Davidoff · Answer 7 · Wed Apr 01 2020 23:13:49 GMT+0800 (China Standard Time)

If you're converting from &[u8] instead of [u8; 8] you also need TryFrom, which has MSRV of 1.34

Tony Arcieri · Answer 8 · Wed Apr 01 2020 23:17:27 GMT+0800 (China Standard Time)

Alternatively you can create a [0u8; 8] and use copy_from_slice. The TryFrom method is definitely nicer though.

Sergey "Shnatsel" Davidoff · Answer 9 · Wed Apr 01 2020 23:20:26 GMT+0800 (China Standard Time)

The copy_from_slice approach can be fiddly, sometimes rustc doesn't optimize equivalent-looking code properly: https://godbolt.org/z/jmac5x

Alex Gaynor · Answer 10 · Wed Apr 01 2020 23:22:46 GMT+0800 (China Standard Time)

Has a Rust (LLVM?) bug been filed on that?

Sergey "Shnatsel" Davidoff · Answer 11 · Wed Apr 01 2020 23:25:57 GMT+0800 (China Standard Time)

Yes, rustc bug: rust-lang/rust#70439

Sergey "Shnatsel" Davidoff · Answer 12 · Sat Apr 04 2020 09:34:58 GMT+0800 (China Standard Time)

I've found some code that's unsound but doesn't pose a security issue and sent in a fix: rust-random/rand#959

Sergey "Shnatsel" Davidoff · Answer 13 · Sat Apr 04 2020 09:56:41 GMT+0800 (China Standard Time)

I've also managed to get rid of unsafe code in &[u32] to u64 conversion by leaning into the optimizer: rust-random/rand#963

Sergey "Shnatsel" Davidoff · Answer 14 · Sat Apr 04 2020 10:16:48 GMT+0800 (China Standard Time)

Another small reduction: rust-random/rand#962

Sergey "Shnatsel" Davidoff · Answer 15 · Mon Apr 06 2020 05:34:06 GMT+0800 (China Standard Time)

I've looked into https://github.com/rust-random/rand/blob/05a1273ea83eeb0c0ade64ea55600b7f1fa39ec5/rand_core/src/block.rs#L352-L373 and it seems this unsafe cannot be removed without degrading performance and/or a major refactoring. But memory safety is just one of many guarantees this code must uphold, so I'm not too concerned about the unsafe here.

On the other hand, the uses of unsafe in the following files seem avoidable:

https://github.com/rust-random/rand/blob/master/rand_core/src/le.rs is avoidable with an MSRV bump to 1.34 to be able to use u32::from_ne_bytes(src.try_into().unwrap())
https://github.com/rust-random/rand/blob/master/rand_core/src/impls.rs#L59 is definitely avoidable with bytemuck, probably also avoidable with std only on 1.34
The same probably goes for https://github.com/rust-random/rand/blob/master/src/rng.rs and https://github.com/rust-random/rand/blob/master/rand_chacha/src/chacha.rs as well, but I'm not entirely sure

Unfortunately, I probably won't have the time to make actual pull requests or look into the remaining unsafe code.

Sergey "Shnatsel" Davidoff · Answer 16 · Mon Apr 06 2020 05:36:24 GMT+0800 (China Standard Time)

MSRV bump from 1.32 to 1.34 should be harmless because even Debian Stable ships 1.34 by now.

Diggory Hardy · Answer 17 · Mon Apr 06 2020 16:11:07 GMT+0800 (China Standard Time)

Rust 1.34 is also nearly a year old. I don't see any problem bumping to this version for the 0.8 release, which is what the master branch is already working towards. (Maybe should ping @vks and @newpavlov to check, but I don't see any issue.)

Sergey "Shnatsel" Davidoff · Answer 18 · Thu Aug 27 2020 01:39:29 GMT+0800 (China Standard Time)

Copying from rust-random/rand#957:

Most unsafe code was removed in rust-random/rand#1011

However, there is one use case remaining (fill_via_chunks) where we could not make the safe code as fast as the unsafe code.

Conversion of fill_via_chunks to safe code has been attempted in rust-random/rand#1011, see that PR for more info.