kyren / gc-arena

So I saw the comment here:

Lines 162 to 177 in c70f838

        // Okay, so this calls `T::trace` on a *copy* of `T`.
        //
        // This is theoretically a correctness issue, because technically `T` could have interior
        // mutability and modify the copy, and this modification would be lost.
        //
        // However, currently there is not a type in rust that allows for interior mutability that
        // is also `Copy`, so this *currently* impossible to even observe.
        //
        // I am assured that this requirement is technially "only" a lint, and could be relaxed in
        // the future. If this requirement is ever relaxed in some way, fixing this is relatively
        // easy, by setting the value of the cell to the copy we make, after tracing (via a drop
        // guard in case of panics). Additionally, this is not a safety issue, only a correctness
        // issue, the changes will "just" be lost after this call returns.
        //
        // It could be fixed now, but since it is not even testable because it is currently
        // *impossible*, I did not bother. One day this may need to be implemented!

I was wondering why this impl needs T to implement Copy in the first place? I get that it is common for Cell contents (and thus Locks) to implement Copy, but I'm not sure why that is needed here for Lock to safely implement Collect.

The trace() method can still be implemented using Cell::as_ptr():

diff --git i/src/gc-arena/src/lock.rs w/src/gc-arena/src/lock.rs
index c2f0156..fd5cd31 100644
--- i/src/gc-arena/src/lock.rs
+++ w/src/gc-arena/src/lock.rs
@@ -151,7 +151,7 @@ impl<'gc, T: Copy + 'gc> Gc<'gc, Lock<T>> {
     }
 }

-unsafe impl<'gc, T: Collect + Copy + 'gc> Collect for Lock<T> {
+unsafe impl<'gc, T: Collect + 'gc> Collect for Lock<T> {
     #[inline]
     fn needs_trace() -> bool {
         T::needs_trace()
@@ -159,23 +159,7 @@ unsafe impl<'gc, T: Collect + Copy + 'gc> Collect for Lock<T> {

     #[inline]
     fn trace(&self, cc: &Collection) {
-        // Okay, so this calls `T::trace` on a *copy* of `T`.
-        //
-        // This is theoretically a correctness issue, because technically `T` could have interior
-        // mutability and modify the copy, and this modification would be lost.
-        //
-        // However, currently there is not a type in rust that allows for interior mutability that
-        // is also `Copy`, so this *currently* impossible to even observe.
-        //
-        // I am assured that this requirement is technially "only" a lint, and could be relaxed in
-        // the future. If this requirement is ever relaxed in some way, fixing this is relatively
-        // easy, by setting the value of the cell to the copy we make, after tracing (via a drop
-        // guard in case of panics). Additionally, this is not a safety issue, only a correctness
-        // issue, the changes will "just" be lost after this call returns.
-        //
-        // It could be fixed now, but since it is not even testable because it is currently
-        // *impossible*, I did not bother. One day this may need to be implemented!
-        T::trace(&self.get(), cc);
+        unsafe { T::trace(&*self.cell.as_ptr(), cc) }
     }
 }

Some more context for where I came across this issue, in case that's helpful for answering my question:

I'm trying to implement a runtime based on 2-word nodes, that looks something like this:

#[repr(align(4))]
pub struct Node<'gc>(Lock<NodeInner<'gc>>); // Lock wanted for interior mutability

struct NodeInner<'gc>(Word<'gc>, Word<'gc>);

union Word<'gc> {
  word: usize,                   // also used to store type tag, with LSB = 1
  ptr: Gc<'gc, Node<'gc>>,       // managed node pointer
  string: ManuallyDrop<ThinStr>, // owned string
  int: isize,                    // by value
}

const _: () = assert!(mem::size_of::<Word>() == mem::size_of::<usize>());

unsafe impl<'gc> Collect for NodeInner<'gc> { .. }
impl<'gc> Drop for NodeInner<'gc> { .. }

I use a bit-stealing scheme based on this Haskell runtime. The first word is usually used to hold a type tag, e.g., Tag::Int or Tag::String, and the second word carries the payload. But for "app" nodes, which hold two pointers, the NodeInner simply contains two Gc pointers. The tag values are chosen such that the LSB is always 1, which we can use to determine whether we're staring at two pointers or at a type tag plus some kind of payload, which may or may not be a pointer.
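A minimal sketch of the tag check described above, not taken from the actual project. The tag constants and the `Shape`, `is_tag`, and `classify` names are hypothetical, and the sketch assumes node pointers are at least 2-byte aligned, so a real pointer never has its LSB set.

```rust
/// Hypothetical tag values; each has its least significant bit set to 1.
const TAG_INT: usize = 0b01;
const TAG_STRING: usize = 0b11;

/// What the first word of a node tells us about the node's layout.
#[derive(Debug, PartialEq)]
enum Shape {
    /// Both words are node pointers (an "app" node).
    App,
    /// The first word is a tag; the second word is the payload.
    Tagged(usize),
}

/// Returns true if `word` is a tag rather than an aligned pointer.
/// Assumption: pointers are at least 2-byte aligned, so their LSB is 0.
fn is_tag(word: usize) -> bool {
    word & 1 == 1
}

/// Classifies a node by inspecting only its first word.
fn classify(first_word: usize) -> Shape {
    if is_tag(first_word) {
        Shape::Tagged(first_word)
    } else {
        Shape::App
    }
}
```

With `#[repr(align(4))]` on `Node`, the two low bits of any node address are zero, which is what leaves room for LSB-tagged values in the same word.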

I need a custom impl Drop for NodeInner because I'm using bit-stealing to implement what is logically an enum; for the same reason, I also need a custom Collect implementation.

I'm quite hesitant to implement Copy (or even Clone) for the NodeInner because it might contain a unique pointer---e.g., to a string. (And I want to support other owned pointers too.) But that Copy trait seems necessary for me to have a Lock<NodeInner>, due to the requirements of gc-arena's impl Collect for Lock.

Taking a step back, and asking a broader question: is there something I'm missing here? I know that this kind of union design is kind of unergonomic, but is it fundamentally incompatible with the way gc-arena is designed? I'd really like my Nodes to be as compact as possible.

(And I know that gc-arena's GcBox allocates additional space for its GcBoxHeader, so this 2-word design would actually occupy 4 words on the heap. But I want to see if it's even possible to have this kind of a representation.)

The main issue is one of safety, due to Collect::trace taking &self rather than &mut self. There are several ways (such as through circular Gc pointers) you could end up providing access to both a &Cell<T> (allowing mutating Cell methods like Cell::set to be called) and a live &T to the inside of it, which would mean potential UB.
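To make the hazard concrete, here is a small sketch that uses `std::cell::Cell` as a stand-in for `Lock` (an assumption; this is not gc-arena code). The hypothetical `trace_via_copy` mirrors the existing copy-based approach: the visitor only ever sees a copy, so mutating the cell in the middle of the visit is harmless. If the visitor were instead handed `&*cell.as_ptr()`, the same `cell.set` call would write through memory that a live `&T` points into.

```rust
use std::cell::Cell;

/// Stand-in for tracing: runs `visit` on a *copy* of the cell's contents,
/// so no `&T` pointing into the cell is alive while `visit` runs.
fn trace_via_copy<T: Copy>(cell: &Cell<T>, visit: impl FnOnce(&T)) {
    let copy = cell.get();
    visit(&copy);
}

/// Returns (value seen by the visitor, value left in the cell).
fn demo() -> (u32, u32) {
    let cell = Cell::new(1u32);
    let mut seen = 0;
    trace_via_copy(&cell, |value| {
        // Mutating the cell here is fine: `value` refers to the copy.
        cell.set(2);
        seen = *value;
    });
    (seen, cell.get())
}
```

The visitor observes the old value while the cell ends up holding the new one, which is the "lost modification" trade-off the source comment describes, in exchange for never aliasing a `&T` with a writable cell.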

If we passed a &mut self to Collect::trace, then this problem goes away trivially (you could just call Cell::get_mut) but it is replaced by a new set of requirements on implementations of Collect::trace: that they not be allowed to access any held Gc pointers at all (similarly to the limitations on Drop impls). The reason for this new limitation is again due to circular self-references: without being very careful, we would allow both a &mut T and a &T to be alive at the same time. Having Collect::trace take &mut self will be required if we ever wish to support copying collectors anyway, and we've considered it, but it makes the safety contract for Collect even more complicated so we haven't done it yet.
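As a rough illustration of the `&mut self` alternative, the sketch below defines a hypothetical `TraceMut` trait (not part of gc-arena) and implements it for `std::cell::Cell`, standing in for `Lock`. With exclusive access, `Cell::get_mut` hands out a `&mut T` safely and no `Copy` bound is needed.

```rust
use std::cell::Cell;

/// Hypothetical tracing trait that takes `&mut self` instead of `&self`.
trait TraceMut {
    fn trace_mut(&mut self, visited: &mut Vec<u32>);
}

/// A leaf value that records itself when traced. Note: not `Copy`.
struct Leaf(u32);

impl TraceMut for Leaf {
    fn trace_mut(&mut self, visited: &mut Vec<u32>) {
        visited.push(self.0);
    }
}

impl<T: TraceMut> TraceMut for Cell<T> {
    fn trace_mut(&mut self, visited: &mut Vec<u32>) {
        // `Cell::get_mut` is safe: `&mut self` proves exclusive access.
        self.get_mut().trace_mut(visited);
    }
}
```

This sketch sidesteps the hard part discussed above: in a real collector, other `Gc` pointers to the same cell may exist, which is why the stricter "never dereference `Gc` pointers while tracing" rule would be needed.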

I think making this change sound is exactly equivalent to having Collect::trace receive &mut self, so it should wait for that change, if / when we finally make it.

I just got back from a trip and I've had a lot of personal things happening in the last month or so, so I haven't been thinking about this for a while. @moulins should probably back me up on this (or tell me if I've gotten something wrong).

> I think making this change sound is exactly equivalent to having Collect::trace receive &mut self, so it should wait for that change, if / when we finally make it.

That's my understanding as well :)

@j-hui For your specific use-case, I'd suggest implementing Copy for Word, and using a separate Lock for each field of NodeInner, like this:

#[repr(align(4))]
pub struct Node<'gc>(NodeInner<'gc>);

struct NodeInner<'gc> {
  fst: Lock<Word<'gc>>,
  snd: Lock<Word<'gc>>,
}

Then you can implement Drop on NodeInner without fear of unexpected copies, and each field's value can still be modified through the use of the unlock! macro.
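A self-contained sketch of that layout, with `std::cell::Cell` standing in for `Lock` and a reduced `Word` that has only integer fields (both are assumptions made so the example compiles without gc-arena). The `TAG_INT` constant and the `new_int`, `set_int`, and `as_int` helpers are hypothetical.

```rust
use std::cell::Cell;

/// Hypothetical tag value with its LSB set to 1.
const TAG_INT: usize = 0b1;

/// One machine word. It is `Copy` because every field is plain data.
#[derive(Clone, Copy)]
union Word {
    word: usize, // raw bits, also used for the type tag
    int: isize,  // by-value integer payload
}

/// Not `Copy`: only the individual words are, so `Drop` runs exactly once.
struct NodeInner {
    fst: Cell<Word>,
    snd: Cell<Word>,
}

impl NodeInner {
    fn new_int(value: isize) -> Self {
        NodeInner {
            fst: Cell::new(Word { word: TAG_INT }),
            snd: Cell::new(Word { int: value }),
        }
    }

    /// Overwrites the payload through a shared reference.
    fn set_int(&self, value: isize) {
        self.snd.set(Word { int: value });
    }

    fn as_int(&self) -> Option<isize> {
        // SAFETY: both fields of `Word` are integers of the same size,
        // so reading `word` is valid for any bit pattern.
        let tag = unsafe { self.fst.get().word };
        if tag == TAG_INT {
            // SAFETY: the tag says the payload was written as `int`.
            Some(unsafe { self.snd.get().int })
        } else {
            None
        }
    }
}

impl Drop for NodeInner {
    fn drop(&mut self) {
        // Nothing is owned in this sketch. A real implementation would
        // inspect the tag and release any owned payload here.
    }
}
```

Each field can be updated independently through `&self`, and because `NodeInner` itself is neither `Copy` nor `Clone`, an owned payload cannot be duplicated by accident.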

Okay, one thing I forgot is that since the Cell is always wrapped in a Lock, all safe methods that provide mutation will require a Mutation context, which won't be available during tracing. However, you can still call Lock::unlock_unchecked, which might be sound in the sense of not allowing for adopting any new pointers, but would now potentially become unsound due to the circular reference problem described above. So I still think this is an issue but it's a little more subtle than I initially appreciated. I still think that this should wait until Collect::trace takes &mut self, but I admit it is an unlikely corner case to have both 1) mutating a Lock in a trace method via Lock::unlock_unchecked AND 2) back references to a Gc<Lock<T>> available inside T::trace.

> Okay, one thing I forgot is that since the Cell is always wrapped in a Lock, all safe methods that provide mutation will require a Mutation context

That's not actually true, Lock::take is safe and doesn't require a &Mutation. But it is true that any code exploiting this for unsoundness would be quite contrived... :D

> That's not actually true, Lock::take is safe and doesn't require a &Mutation. But it is true that any code exploiting this for unsoundness would be quite contrived... :D

Ah heck I forgot about Lock::take 😝.

Thanks so much for the really informative response and discussion. And sorry for not ACKing sooner!

> For your specific use-case, I'd suggest implementing Copy for Word, and using a separate Lock for each field of NodeInner

For now, I've been using the "naive" but idiomatic solution based on regular enums etc., for the sake of expediency in my own project. But I'll try @moulins's suggestion next. If anything, I'd be curious to see what the impact is on speed and space, compared to the naive solution.

> The main issue is of safety due to Collect::trace taking &self rather than &mut self. There are several ways (such as through circular Gc pointers) you could end up providing access to both a &Cell<T> (allowing mutating Cell methods like Cell::set to be called) with a live &T to the inside of it, which would mean potential UB.

I understand how having both a &T and a &Cell<T> is UB, but I'm not sure how one could obtain a reference to data maintained within a Cell. (Isn't the point of a Cell to give up references in exchange for interior mutability?) Do you have a more concrete example of what you're suggesting could happen?

> Thanks so much for the really informative response and discussion. And sorry for not ACKing sooner!

No worries!

> I understand how having both a &T and a &Cell<T> is UB, but I'm not sure how one could obtain a reference to data maintained within a Cell. (Isn't the point of a Cell to give up references in exchange for interior mutability?) Do you have a more concrete example of what you're suggesting could happen?

It might be pretty torturous to actually exploit, but the idea is that somewhere you have a field of type Gc<'gc, Lock<T>>, right? Eventually, the Collect::trace method for Lock<T> will be called (the implementation we've been discussing), which will in turn call the Collect::trace method for T itself. This means that there is a &T around as the &self parameter to <T as Collect>::trace. The original Gc<'gc, Lock<T>> pointer could have any number of copies available anywhere, including within the held T, since unrestricted back and even self pointers are sort of the whole idea of gc-arena.

So, in the <T as Collect>::trace method, you have both a pointer to T (&self) and potentially a pointer to the outer Cell<T> (inside Lock<T>). You don't have a &Mutation available, but you could safely call Lock<T>::take on the same instance that holds &self and cause UB. The existing implementation prevents UB by calling <T as Collect>::trace on a copy of the held T, and the implementation for RefLock<T> prevents the same UB by actually holding a read lock, triggering a panic if you try to do a write.

If we changed the signature of Collect::trace to take &mut self, then the UB would still be there (and actually the whole situation would be dramatically more dangerous!), but the rules around Collect::trace would be tightened to forbid ever dereferencing Gc pointers at all, so you couldn't get in trouble without first violating a much stricter rule. It's a big, dramatic change, though, that might get rid of some valid use cases, so we haven't made it yet.
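The read-lock strategy mentioned for RefLock<T> can be sketched with `std::cell::RefCell` as a stand-in (an assumption; this is not the gc-arena implementation). The hypothetical `trace_with_read_lock` keeps a shared borrow alive for the whole visit, so any attempted write during tracing is rejected at runtime instead of causing UB.

```rust
use std::cell::RefCell;

/// Stand-in for tracing: holds a shared borrow for the duration of `visit`.
fn trace_with_read_lock<T>(cell: &RefCell<T>, visit: impl FnOnce(&T)) {
    let guard = cell.borrow();
    visit(&*guard);
}

/// Returns true if a write attempted during tracing was rejected.
fn write_is_rejected_during_trace() -> bool {
    let cell = RefCell::new(String::from("payload"));
    let mut rejected = false;
    trace_with_read_lock(&cell, |_value| {
        // `borrow_mut` would panic here; `try_borrow_mut` reports the
        // conflict as an error instead, which is easier to observe.
        rejected = cell.try_borrow_mut().is_err();
    });
    rejected
}
```

This works for non-`Copy` contents, at the cost of a borrow flag per cell, which is the trade-off between the `Lock` and `RefLock` approaches described above.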

Why does impl Collect for Lock need Copy?