tf-encrypted / aes-prng

Rust pseudo-random number generator based on AES

Investigate simple performance improvement for Apple M1

mortendahl opened this issue

AES seems to be slower than ChaCha on Apple M1:

rng_fill/chacha8/2000000
                        time:   [1.6929 ms 1.6940 ms 1.6951 ms]
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe
rng_fill/chacha12/2000000
                        time:   [2.4556 ms 2.4581 ms 2.4606 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
rng_fill/chacha20/2000000
                        time:   [3.9821 ms 3.9857 ms 3.9895 ms]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
rng_fill/aes/2000000    time:   [8.4624 ms 8.4707 ms 8.4792 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

rng_next_u64/chacha8    time:   [8.0137 us 8.0212 us 8.0285 us]
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe
rng_next_u64/chacha12   time:   [11.055 us 11.065 us 11.076 us]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe
rng_next_u64/chacha20   time:   [17.142 us 17.161 us 17.179 us]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  1 (1.00%) high severe
rng_next_u64/aes        time:   [36.918 us 36.950 us 36.983 us]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe

Maybe this can be fixed by simply enabling a flag.
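Before changing build flags, it may be worth confirming that the CPU actually advertises hardware AES, since the numbers above look like a software fallback. The following is only a sketch using the standard library's runtime feature detection; the helper name `has_hw_aes` is made up for illustration and is not part of this crate or of the `aes` crate.

```rust
// Hypothetical helper: reports whether the CPU advertises AES instructions.
// Uses only std's runtime feature-detection macros.
#[cfg(target_arch = "aarch64")]
fn has_hw_aes() -> bool {
    // ARMv8 Cryptography Extensions (AESE/AESD/AESMC/AESIMC).
    std::arch::is_aarch64_feature_detected!("aes")
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_hw_aes() -> bool {
    // AES-NI on x86 and x86_64.
    std::arch::is_x86_feature_detected!("aes")
}

#[cfg(not(any(target_arch = "aarch64", target_arch = "x86", target_arch = "x86_64")))]
fn has_hw_aes() -> bool {
    // No detection available for other targets in this sketch.
    false
}

fn main() {
    println!("target arch: {}", std::env::consts::ARCH);
    println!("hardware AES detected: {}", has_hw_aes());
}
```

If this reports hardware AES on the M1, the remaining question is only whether the build is configured to use it.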

Note that building with the aes_armv8 configuration flag on nightly (see https://docs.rs/aes/0.8.1/aes/#configuration-flags):

$ RUSTFLAGS="--cfg aes_armv8" cargo +nightly bench

gives much better results:

rng_fill/chacha8/2000000
                        time:   [1.5236 ms 1.5249 ms 1.5263 ms]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe
rng_fill/chacha12/2000000
                        time:   [2.2282 ms 2.2300 ms 2.2320 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
rng_fill/chacha20/2000000
                        time:   [3.6580 ms 3.6617 ms 3.6656 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
rng_fill/aes/2000000    time:   [225.40 us 225.76 us 226.14 us]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

rng_next_u64/chacha8    time:   [6.2238 us 6.2301 us 6.2366 us]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
rng_next_u64/chacha12   time:   [9.0791 us 9.0910 us 9.1064 us]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
rng_next_u64/chacha20   time:   [14.792 us 14.805 us 14.818 us]
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe
rng_next_u64/aes        time:   [1.9284 us 1.9315 us 1.9346 us]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
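
To put the two runs side by side, the sketch below computes the ratio of the median estimates (the middle value in each bracket) copied from the output above. The `speedup` helper is just illustrative arithmetic and is not part of the benchmark suite.

```rust
/// Ratio of two timings expressed in the same unit.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

fn main() {
    // (benchmark, median without flag, median with aes_armv8), all in microseconds.
    let rows = [
        ("rng_fill/aes/2000000", 8470.7, 225.76),
        ("rng_next_u64/aes", 36.950, 1.9315),
        ("rng_fill/chacha8/2000000", 1694.0, 1524.9),
        ("rng_next_u64/chacha8", 8.0212, 6.2301),
    ];
    for (name, before, after) in rows {
        println!("{name}: {:.1}x", speedup(before, after));
    }
}
```

By this measure AES improves by roughly 37.5x on rng_fill and 19.1x on rng_next_u64, while the chacha8 timings move by only about 1.1x and 1.3x between the two runs.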