jordanbray / chess

A rust library to manage chess move generation

Home Page:https://jordanbray.github.io/chess/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Improper use of MaybeUninit in code is undefined behaviour

analog-hors opened this issue · comments

The documentation of MaybeUninit::as_mut_ptr mentions that

Reading from this pointer or turning it into a reference is undefined behavior unless the MaybeUninit<T> is initialized.

This is exactly what happens in all usages of MaybeUninit throughout the library:

chess/src/board.rs

Lines 864 to 868 in ad13857

let mut result = mem::MaybeUninit::<Board>::uninit();
unsafe {
self.make_move(m, &mut *result.as_mut_ptr());
result.assume_init()
}

let mut bresult = mem::MaybeUninit::<Board>::uninit();
unsafe {
board.make_move(m, &mut *bresult.as_mut_ptr());
result += MoveGen::movegen_perft_test(&*bresult.as_ptr(), depth - 1);
}

let mut bresult = mem::MaybeUninit::<Board>::uninit();
unsafe {
board.make_move(x, &mut *bresult.as_mut_ptr());
result += MoveGen::movegen_perft_test(&*bresult.as_ptr(), depth - 1);
}

let mut bresult = mem::MaybeUninit::<Board>::uninit();
unsafe {
board.make_move(x, &mut *bresult.as_mut_ptr());
result += MoveGen::movegen_perft_test(&*bresult.as_ptr(), depth - 1);
}

commented

Board contains

  • Option<Square> (which is an Option<u8>)
  • Color, which an enum with two variants (kinda like bool)

It is very dangerous to deref this structure without initializing it
Rust assumes that Color will be either Black or White
Or the tag from Option<u8> will be always either Some or None
Also, LLVM can just mark the structure with undef and yeet out all the code that uses it, just because it is UB

EDIT: Also CastleRights is an enum
There shouldn't be a real performance impact on initializing a ~100 bytes structure, the cost of code being (potentially) completely broken I think just outweighs it

There are several unsound issues here that I will take a look at this week. Don't view my lack of comments on the other issues as meaning I have ignored them.

The performance impact of initializing the Board structure is actually pretty measurable. Every instance of unsafe in this codebase is backed up by performance numbers.

That's not to say (or indicate) that the safety issue should be ignored. It shouldn't. But, the goal of this library is to generate moves as fast as possible, and this library exists in that environment.

@Soveu I was wrong. You said:

There shouldn't be a real performance impact on initializing a ~100 bytes structure, the cost of code being (potentially) completely broken I think just outweighs it

I responded:

The performance impact of initializing the Board structure is actually pretty measurable. Every instance of unsafe in this codebase is backed up by performance numbers.

I was wrong. Well, I was right. The compiler seems to have gotten better. By my tests, the compiler optimizes this correctly as of v 1.45. As a result, the board.make_move() function can be deprecated in favor of board.make_move_new().

I will need to verify some of this a bit more.

EDIT: by that I mean, board.make_move_new() can just use a Board::new() to do it's work, not an uninitialized board I believe.

commented

Well now I have similar problem when working on hyper - passing uninitialized data to a function that just writes.
Here I would suggest "copy-pasting" fn make_move to some sort of fn make_move_uninit that accepts &mut MaybeUninit<Board> and promise no garbage data is written.
The make_move would look like this

pub fn make_move(&self, m: ChessMove, result: &mut Board) {
  // SAFETY: make_move_uninit promises to not write garbage data
  unsafe {
    let result: *mut Board = result;
    let result = result as *mut MaybeUninit<Board>;
    return make_move_uninit(self, m, &mut *result);
  }
}

I could do that, but perhaps a better way would be to copy-paste the contents of make_move into make_move_new.

So, the new version would look like this:

pub fn make_move_new(&self, m: ChessMove) -> Board {
    let mut next_pos = Board::new();

    // Contents of make_move

   next_pos
}

pub fn make_move(&self, m: ChessMove, result: &mut Board) {
    *result = self.make_move_new(m);
}

The reason it isn't written this way currently, is because at the end of make_move_new, rust was copying the position, even though there's no reason for a copy there. This is an optimization called named return-value optimization, and is never promised (in any language).

However, it does seem that current rust compilers do have support for this. By my (very lazy) experiments, it seems to handle this sort of thing around 1.45.

In addition, the Board::new() in my code above should be effectively optimized out, as I believe the compiler will eliminate the unnecessary writes that happen there.

I still need to fact-check a lot of that by looking at the assembly a bit more. I only ran it on some simpler examples, not on this specific codebase while investigating, but at the moment I believe everything I have said is sound. I reserve the right to change my mind if I'm wrong, lol.

commented

This has been completed.