Amanieu / intrusive-rs

Intrusive collections for Rust

Great crate, major hole in it ;-)

przygienda opened this issue

Crate is great. Very common problem with this kind of stuff, though: I insert into an RBTree, then use the linked list to change keys, and the tree ends up corrupted, since it seems to pull local copies of keys (basically, the invariants change). Adapters would need to somehow let each other know when a key is being modified, or there would need to be some other kind of API fix to prevent this.

// get a bunch of elements into the tree/lists, then flip keys and see whether
// sorting on the tree holds

	let mut a = LinkedList::new(MyAdapter::new());
	let mut b = SinglyLinkedList::new(MyAdapter2::new());
	let mut c = RBTree::new(MyAdapter3::new());

	for v in &[30, 40, 50, 60, 70, 80, 90, 100] {
		let test = Rc::new(Test {
			value: Cell::new(*v),
			..Test::default()
		});
		a.push_front(test.clone());
		b.push_front(test.clone());
		c.insert(test);
	}

	{
		// Find the first element whose key is greater than or equal to 40
		let mut cursor = c.lower_bound_mut(Bound::Included(&40));

		// Walk every element from there to the end of the tree
		while !cursor.is_null() {
			let v = cursor.get();
			println!("{:?}", v);

			cursor.move_next();
		}
	}

	let mut v = 0xff; // pseudo rand

	let deft = Test {
		value: Cell::new(99),
		..Test::default()
	};

	// walk the linked list and add a pseudo-random offset to every key
	let mut mc = a.cursor_mut();

	mc.move_next();

	while ! mc.is_null() {
		let iv = &mc.get().unwrap_or(&deft).value;

		if let Some(imc) = mc.get() {
			println!("{} -> {}", iv.get(), iv.get() + v);
			imc.value.set(iv.get() + v);
		}

		v ^= iv.get();
		mc.move_next();
	}

	{
		let mut cursor = c.lower_bound_mut(Bound::Included(&40));

		// Walk the tree again from the same lower bound; the keys have changed underneath it
		while !cursor.is_null() {
			let v = cursor.get();
			println!("{:?}", v);

			cursor.move_next();
		}
	}

Unfortunately I don't think it is possible to fix this hole reliably. Note that this isn't unique to this crate. If you look at the documentation of BTreeMap in the standard library, the last paragraph of the overview says this:

It is a logic error for a key to be modified in such a way that the key's ordering relative to any other key, as determined by the Ord trait, changes while it is in the map. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code.

Our documentation says something similar:

Note that you are responsible for ensuring that the elements in a RBTree remain in ascending key order. This property can be violated, either because the key of an element was modified, or because the insert_before/insert_after methods of CursorMut were incorrectly used. If this situation occurs, memory safety will not be violated but the find, upper_bound, lower_bound and range may return incorrect results.

The good news is that, like BTreeMap, RBTree does not rely on Ord to maintain memory safety. The only negative impact is that search may return the wrong element and the elements won't be sorted in increasing key order.
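The same effect can be reproduced with the standard library alone. The sketch below is only an illustration: `Key` and `keys_after_mutation` are made-up names, and the exact outcome of a logic error like this is unspecified, so the code prints what it observes rather than promising a particular order. The one thing it relies on is the point made above: the set stays memory safe, it just stops being sorted.

```rust
use std::cell::Cell;
use std::cmp::Ordering;
use std::collections::BTreeSet;
use std::rc::Rc;

// Hypothetical key type: its ordering depends on a value stored in a Cell,
// so the ordering can change while the key sits inside a collection.
struct Key(Cell<i32>);

impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.0.get() == other.0.get()
    }
}
impl Eq for Key {}
impl PartialOrd for Key {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Key {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.get().cmp(&other.0.get())
    }
}

/// Inserts keys 1, 2, 3 into a BTreeSet, then rewrites the smallest key to 10
/// through a second handle and returns the keys in iteration order.
fn keys_after_mutation() -> Vec<i32> {
    let keys: Vec<Rc<Key>> = [1, 2, 3]
        .iter()
        .map(|&v| Rc::new(Key(Cell::new(v))))
        .collect();
    let set: BTreeSet<Rc<Key>> = keys.iter().cloned().collect();

    // Logic error: the key changes while the element is still in the set.
    keys[0].0.set(10);

    set.iter().map(|k| k.0.get()).collect()
}

fn main() {
    let seen = keys_after_mutation();
    let sorted = seen.windows(2).all(|w| w[0] <= w[1]);
    println!("iteration order: {:?}, still sorted: {}", seen, sorted);
}
```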

ok, understood. It's a major pain in all languages/designs pursuing this "intrinsic collection pointer" stuff (albeit it's often used for its memory-allocation efficiency). So that's why I'm stretching ;-)

Freewheeling thinking here: since you have such clean adapters on top, couldn't we force the key to be something like a recursive RWLock, with the tree read-locking the key when an element is inserted? When the element is removed from the tree, the read lock is released. The only way to modify the key would be to remove the element from all trees and then write-lock the key ... The lock could be hidden behind an abstract interface that offers rlock() and wlock_scope(). Trying to insert into a collection while holding the write lock would fail with "cannot read lock key" ...

That wouldn't actually help since you could use a custom type as a key which returned a random result in its Ord implementation. In the end, I think it's just too difficult or too expensive to enforce that keys are not changed, so I prefer simply leaving the responsibility to the user.

Hmm, ok, but I don't fully follow the argument. If someone misimplements Ord then nothing can be done, and even a normal tree won't work. Same as Hash changing under normal collections.

So, no interest in MisorderFreeRBTree ;-) ?

From long experience on large codebases, the changing-key problem is one of the hardest bugs to track down and resolve, and it can be trivially introduced by someone new to the code who doesn't understand all the collections and their invariants. Trees are not that bad, but it gets even harder once you have e.g. ordered queues and someone changes keys. Having some concept of a "mod-safe-key" that can only be mucked around with once the struct is off all collections would go a long way to sell this package IME ;-)

Hmm, actually after thinking about this a bit, it might be possible to add this as an optional feature. We could add the following methods to KeyAdapter:

    fn lock_key(&self, value: &'a Self::Value) {}
    fn unlock_key(&self, value: &'a Self::Value) {}

They will be called before an object is inserted into the tree and after it is removed. By default they do nothing, but you can use them with a custom key wrapper type to prevent the key from being modified.
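A rough sketch of how such hooks could be wired up, using only the standard library. Everything here is hypothetical: `KeyHooks` stands in for the proposed additions and is not the crate's real `KeyAdapter` trait, and `SortedVec` is a toy ordered container used in place of `RBTree` so the example stays short.

```rust
use std::cell::Cell;

// Hypothetical stand-in for the proposed hooks; NOT the real KeyAdapter trait.
trait KeyHooks<V> {
    fn key(&self, value: &V) -> i32;
    // Default hooks do nothing, as suggested in the proposal.
    fn lock_key(&self, _value: &V) {}
    fn unlock_key(&self, _value: &V) {}
}

struct Item {
    key: Cell<i32>,
    in_trees: Cell<u32>, // how many collections currently hold this item
}

// An adapter that overrides the hooks to keep a per-item count.
struct CountingAdapter;

impl KeyHooks<Item> for CountingAdapter {
    fn key(&self, value: &Item) -> i32 {
        value.key.get()
    }
    fn lock_key(&self, value: &Item) {
        value.in_trees.set(value.in_trees.get() + 1);
    }
    fn unlock_key(&self, value: &Item) {
        value.in_trees.set(value.in_trees.get() - 1);
    }
}

// Toy ordered container (a Vec kept in key order) that calls the hooks.
struct SortedVec<'a, A> {
    adapter: A,
    items: Vec<&'a Item>,
}

impl<'a, A: KeyHooks<Item>> SortedVec<'a, A> {
    fn new(adapter: A) -> Self {
        SortedVec { adapter, items: Vec::new() }
    }

    fn insert(&mut self, item: &'a Item) {
        // Hook runs before the item becomes part of the ordering.
        self.adapter.lock_key(item);
        let k = self.adapter.key(item);
        let pos = self.items.partition_point(|i| self.adapter.key(i) <= k);
        self.items.insert(pos, item);
    }

    fn pop_first(&mut self) -> Option<&'a Item> {
        if self.items.is_empty() {
            return None;
        }
        let item = self.items.remove(0);
        // Hook runs after the item has left the ordering.
        self.adapter.unlock_key(item);
        Some(item)
    }
}

/// Returns the item's count while inserted and after removal.
fn hook_counts() -> (u32, u32) {
    let item = Item { key: Cell::new(40), in_trees: Cell::new(0) };
    let mut tree = SortedVec::new(CountingAdapter);
    tree.insert(&item);
    let while_inserted = item.in_trees.get();
    tree.pop_first();
    (while_inserted, item.in_trees.get())
}

fn main() {
    println!("{:?}", hook_counts());
}
```

Since the hooks default to no-ops, an adapter that does not override them behaves exactly as before, which is what makes this workable as an optional feature.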

Thoughts?

Ah no, it's very simple actually: we can just make ModSafeKey<T> deref to T. Keep in mind that the locking (effectively just a reference count on the number of trees that the key is in) only serves to prevent writing. You can always read the key.
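A minimal sketch of what such a wrapper might look like. No `ModSafeKey` type appears anywhere in this thread beyond the name, so the fields and methods below are all assumptions. To stay in safe code, the sketch reads through a `get()` method on `Copy` keys rather than implementing `Deref`, which is a deviation from the suggestion above.

```rust
use std::cell::Cell;

/// Hypothetical wrapper: the key can always be read, but it can only be
/// written while no collection holds a lock on it.
struct ModSafeKey<T: Copy> {
    value: Cell<T>,
    locks: Cell<usize>, // number of collections the key is currently in
}

impl<T: Copy> ModSafeKey<T> {
    fn new(value: T) -> Self {
        ModSafeKey { value: Cell::new(value), locks: Cell::new(0) }
    }

    /// Reading is always allowed.
    fn get(&self) -> T {
        self.value.get()
    }

    /// Would be called from a lock_key-style hook on insertion.
    fn lock(&self) {
        self.locks.set(self.locks.get() + 1);
    }

    /// Would be called from an unlock_key-style hook on removal.
    fn unlock(&self) {
        let remaining = self
            .locks
            .get()
            .checked_sub(1)
            .expect("unlock called without a matching lock");
        self.locks.set(remaining);
    }

    /// Writes the key only if no collection holds it; otherwise hands the
    /// rejected value back to the caller.
    fn try_set(&self, value: T) -> Result<(), T> {
        if self.locks.get() == 0 {
            self.value.set(value);
            Ok(())
        } else {
            Err(value)
        }
    }
}

fn main() {
    let key = ModSafeKey::new(40);
    key.lock(); // element inserted into a tree
    println!("write while locked: {:?}", key.try_set(99));
    key.unlock(); // element removed again
    println!("write while unlocked: {:?}", key.try_set(99));
    println!("key is now {}", key.get());
}
```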

Regarding a whole separate ModSafeRBTree, I really don't want to do that because it's a lot of code duplication for a very small change. Just allowing custom hooks in KeyAdapter should be enough.

Note that the key can only be modified if it is inside a Cell or a RefCell. An easy way to ensure the key can't be modified while it is used in a tree is to simply not put it in a cell type. Rust's lifetime system will then guarantee that the value is never changed.
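To illustrate with the standard library only: once the key is a plain field, a shared `Rc` gives no way to write to it. The `Test` struct below mirrors the one in the original report but with a plain `i32` instead of a `Cell`, and the second `Rc` is just a stand-in for the handle a tree would hold.

```rust
use std::rc::Rc;

struct Test {
    value: i32, // plain field, no Cell or RefCell
}

/// While a second handle exists (as it would inside a tree), mutable access
/// to the shared value is refused.
fn can_mutate_while_shared() -> bool {
    let mut owner = Rc::new(Test { value: 40 });
    let in_collection = Rc::clone(&owner); // stand-in for the tree's handle
    let allowed = Rc::get_mut(&mut owner).is_some();
    drop(in_collection);
    allowed
}

/// Once the value is uniquely owned again, the key can be changed.
fn mutate_when_unique() -> Option<i32> {
    let mut owner = Rc::new(Test { value: 40 });
    let test = Rc::get_mut(&mut owner)?;
    test.value = 50;
    Some(owner.value)
}

fn main() {
    println!("mutable while shared: {}", can_mutate_while_shared());
    println!("value after unique mutation: {:?}", mutate_when_unique());
}
```

Writing `owner.value = 50` directly while the value is shared does not even compile, since `Rc` only hands out shared references.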