Control over children ordering
Drakulix opened this issue · comments
I am wondering if any guarantees on the order of children would be a goal or non-goal of this library?
It seems insert_with_parent currently only pushes to the end of the children list.
If I would like to replace a child and keep its current position, this does not seem to be easily possible, and I don't even know if I can rely on insert_with_parent always appending at the end, because this is not documented.
For my use case I just have a left and a right child and it would be great to be able to differentiate between those. But I have no good idea how to provide this functionality in the context of this library with an arbitrary number of child nodes.
Possible API suggestions:
- Split insert into two functions that push at the front and back of the children respectively.
  - Not a very nice API in my opinion. The use case is not very obvious at first.
  - Actual sorting would be difficult to emulate given only these functions
  - Enough for my use case
  - Easy to implement
- Allow sorting of children via a closure
  - Makes a good API
  - Can be used to reorder children after "replacing" (remove/insert), but needs to track old positions manually
    - That means not very nice for my use case
  - Easy to implement (https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.sort_by_key)
- Allow swapping of children
  - Also not a very nice API. The use case is not very obvious at first.
  - Actual sorting would be difficult to emulate given only these functions
  - Enough for my use case
  - Easy to implement
- Allow replacement of Nodes and swapping of Nodes.
  - Swapping could take two NodeIds
  - Replacing could insert a new node, removing the old in the process.
  - Still does not make very strong order guarantees except to keep the old order
  - Feels like it has use cases beyond this
I mostly opened this issue to discuss, if you would want to support any order guarantees anyway. The rest is just a little bit of brainstorming, please feel free to edit/extend/ignore it, however you like.
Thanks again for submitting an issue, all of this is very much appreciated!
You are absolutely correct in saying that insert_with_parent only ever pushes to the end of the child array. However, this is just the way I happened to implement the system, so I didn't originally mean for there to be any guarantees there.
That's not to say that I'm against having any guarantees about child ordering, it's just that I haven't made any explicit efforts to guarantee any specific behavior so far. As you said, this is not documented anywhere, so that should probably be fixed at some point (hopefully it will be fixed shortly after we hash this out because I think this is a discussion that is definitely needed).
Right now I'm leaning towards having insert_with_parent's behavior stay the way it is and documenting the fact that new child Nodes will always be inserted "after" existing children. I think this behavior is nice for several reasons (some of which may be debatable):
- I think this behavior is what most people would expect from such a function
- It should be fast (assuming that the underlying Vec doesn't need to re-allocate space)
- It's a pretty simple implementation
Basically, I want the ordering to be guaranteed not to change unless the caller explicitly asks for it to change. This is similar to how the caller shouldn't need to worry about NodeIds becoming invalid unless the caller explicitly clones a NodeId (and then proceeds to remove the corresponding Node from the Tree).
With that in mind, I agree it would be very nice to be able to sort the children of a Node, so I'm thinking we definitely need something like sort_children_by_key/sort_children_by (maybe both?). I did have one question on this one though: could you clarify what you mean by '...but needs to track old positions manually'?
I do also think it would be very nice to have a replace function and a swap function as I can imagine those could be very useful in certain scenarios.
What do you think about the above approach? I know I basically just responded with "I like all of those, let's do all of them", but I think most of those functions are things that people would expect from a tree library, so they'll be nice to have.
> With that in mind, I agree it would be very nice to be able to sort the children of a Node, so I'm thinking we definitely need something like sort_children_by_key/sort_children_by (maybe both?). I did have one question on this one though: could you clarify what you mean by '...but needs to track old positions manually'?
That is just relevant to my use case. If you want to keep the insertion order and replace a child with just the sort functions available, you would have to remember the insertion order prior to making the modifications to restore it later. A real replace function would be a much better solution.
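To make the difference concrete, here is a minimal sketch of what position-preserving replace and swap could look like on a plain list of child ids. ChildId, replace_child and swap_children are hypothetical names used only for illustration; they are not this library's API, and a real implementation would also have to update the nodes themselves.

```rust
/// Hypothetical stand-in for the library's node identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ChildId(usize);

/// Overwrite `old` with `new` in place, so the slot keeps its position.
/// Returns the position, or `None` if `old` is not in the list.
fn replace_child(children: &mut [ChildId], old: ChildId, new: ChildId) -> Option<usize> {
    let pos = children.iter().position(|&c| c == old)?;
    children[pos] = new;
    Some(pos)
}

/// Exchange the positions of two children.
/// Returns `false` (and changes nothing) if either id is missing.
fn swap_children(children: &mut [ChildId], a: ChildId, b: ChildId) -> bool {
    let pos_a = children.iter().position(|&c| c == a);
    let pos_b = children.iter().position(|&c| c == b);
    match (pos_a, pos_b) {
        (Some(i), Some(j)) => {
            children.swap(i, j);
            true
        }
        _ => false,
    }
}
```

With only remove/insert plus sorting, the caller would have to record the old position and restore it afterwards; here the position never changes in the first place.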
Alright, I am going to implement replace and swap, as well as both sort methods in the next days, maybe even today. I don't think there is anything wrong with exposing both functions.
> If you want to keep the insertion order and replace a child with just the sort functions available, you would have to remember the insertion order prior to making the modifications to restore it later. A real replace function would be a much better solution.
Ah, I gotcha, that makes sense. And yes, I agree, a real replace function would be better.
> Alright, I am going to implement replace and swap, as well as both sort methods in the next days, maybe even today. I don't think there is anything wrong with exposing both functions.
Awesome, I appreciate it! I think those will be great additions!
I would like to request that you make sure that both parent and children values get cleared out on the Node that is removed during replace. Just to help make sure that they don't live longer than they should on accident.
If any questions/concerns come up while you're working on those please feel free to ask.
> I would like to request that you make sure that both parent and children values get cleared out on the Node that is removed during replace. Just to help make sure that they don't live longer than they should on accident.
Sure. I will use the existing functions as a reference.
I am running into some problems implementing sort_by or sort_by_key.
I cannot directly use self.children_mut().sort_by(f), because this only yields NodeIds. But if you can call sort on a Node, you must have acquired this node by using tree.get_mut(), which means the Tree is already borrowed mutably and you cannot use it inside the closure to get a Node for the NodeId passed to you.
I am currently thinking the only way to work around this is to implement the sorting methods on Tree directly.
Do you have an opinion on how to implement this?
My first thought is to add a method like this to Tree:
pub fn sort_children_by_data(&mut self, node_id: &NodeId) -> Result<(), NodeIdError> where T: Ord {
    let (is_valid, error) = self.is_valid_node_id(node_id);
    if !is_valid {
        return Result::Err(error.expect("Tree::sort_children_by_data: Missing an error value on finding an invalid NodeId."));
    }
    let mut children = self.get_unsafe(node_id).children().clone();
    children.sort_by_key(|a| {
        self.get_unsafe(a).data()
    });
    // set_children is a new (private) method.
    self.get_mut_unsafe(node_id).set_children(children);
    Result::Ok(())
}
Notes on this:
- Sadly, this approach requires a clone.
- We would need a new method on the MutableNode trait called set_children for this approach.
- Not sure how hard it is to allow a custom closure to be passed in, but this one is obviously hard-coded to compare Node::data() directly, which requires T: Ord.
Again, this is just my first thought on how I would have done it, but there may be a better way to go about it.
Any thoughts on this? Does that help at all?
EDIT: I did run that and it does type-check properly (when I removed the call to set_children since that doesn't exist yet).
Sure it does.
Passing a closure should not be a huge problem.
Any reason why you don't use children_mut() inside this function, now that there is the possibility of mutable access to the node? That would mean we need no clone and no set_children.
I will add those methods to the tree directly then and try to work around the cloning as described.
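For reference, a closure-taking version of the clone-based approach could be sketched roughly as follows. This uses a simplified, hypothetical arena (Arena, Node and plain usize indices instead of NodeId) rather than the library's real types, and it panics on an out-of-range index instead of returning a NodeIdError.

```rust
use std::cmp::Ordering;

/// Simplified, hypothetical node: payload plus the indices of its children.
struct Node<T> {
    data: T,
    children: Vec<usize>,
}

/// Simplified, hypothetical tree storage: all nodes live in one Vec.
struct Arena<T> {
    nodes: Vec<Node<T>>,
}

impl<T> Arena<T> {
    /// Sort the children of `parent` using a caller-supplied comparison
    /// of the children's data. Panics if an index is out of range.
    fn sort_children_by<F>(&mut self, parent: usize, mut compare: F)
    where
        F: FnMut(&T, &T) -> Ordering,
    {
        // Clone the child list so no borrow of the parent is held while sorting.
        let mut children = self.nodes[parent].children.clone();
        // The sort closure only needs shared access to the arena.
        children.sort_by(|&a, &b| compare(&self.nodes[a].data, &self.nodes[b].data));
        // Write the sorted list back once the shared borrow has ended.
        self.nodes[parent].children = children;
    }
}
```

Because the closure receives the data of both children, the same method covers reverse ordering or comparing by a single field without requiring T: Ord.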
Well, originally I had this:
pub fn sort_children_by_data(&mut self, node_id: &NodeId) -> Result<(), NodeIdError> where T: Ord {
    let (is_valid, error) = self.is_valid_node_id(node_id);
    if !is_valid {
        return Result::Err(error.expect("Tree::move_node_to_parent: Missing an error value on finding an invalid NodeId."));
    }
    let mut children = self.get_mut_unsafe(node_id).children_mut();
    children.sort_by_key(|a| {
        self.get_unsafe(a).data()
    });
    Result::Ok(())
}
and I got this error:
error[E0502]: cannot borrow `self` as immutable because `*self` is also borrowed as mutable
   --> src\tree.rs:455:30
    |
453 |         let mut children = self.get_mut_unsafe(node_id).children_mut();
    |                            ---- mutable borrow occurs here
454 |
455 |         children.sort_by_key(|a| {
    |                              ^^^ immutable borrow occurs here
456 |             self.get_unsafe(a).data()
    |             ---- borrow occurs due to use of `self` in closure
...
460 |     }
    |     - mutable borrow ends here
So that's why I opted for the immutable borrow on the node (note get_mut_unsafe vs get_unsafe) + clone the children idea.
But maybe there's another way to approach it that I'm just not seeing right now.
Just ran into this problem as well.
This is a little frustrating, because we know that none of the children will be the same as node_id.
There is no easy way to work around this. self.nodes.split(n) might be used to get those separately without cloning, but I think for the readability of the source code, we should go with clone instead.
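To illustrate the splitting idea and why it costs readability: I am assuming the standard slice method split_at_mut is what is meant here, since it hands out two disjoint halves of the same storage. The sketch below uses a hypothetical simplified Node with usize indices, not the library's types, and relies on a node never being its own child.

```rust
/// Simplified, hypothetical node: payload plus the indices of its children.
struct Node<T> {
    data: T,
    children: Vec<usize>,
}

/// Look up a node's data in whichever half of the split holds it.
/// `head` covers indices below `parent`, `rest` covers indices above it.
fn data_of<'a, T>(head: &'a [Node<T>], rest: &'a [Node<T>], parent: usize, id: usize) -> &'a T {
    if id < parent {
        &head[id].data
    } else {
        // Assumes `id != parent`; a node being its own child would panic here.
        &rest[id - parent - 1].data
    }
}

/// Sort the children of `parent` in place, without cloning the child list.
fn sort_children_in_place<T: Ord>(nodes: &mut [Node<T>], parent: usize) {
    // Split so that the parent is the first element of `tail`.
    let (head, tail) = nodes.split_at_mut(parent);
    let (node, rest) = tail.split_first_mut().expect("parent index out of bounds");
    // Only the parent needs mutable access; the two halves are read-only.
    let head = &*head;
    let rest = &*rest;
    node.children
        .sort_by(|&a, &b| data_of(head, rest, parent, a).cmp(data_of(head, rest, parent, b)));
}
```

This avoids the clone, but the index arithmetic is easy to get wrong, which supports going with clone for readability.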
Yeah, I agree it's pretty frustrating.
But I think you're right though, going with the clone approach is probably best.
Just a thought:
We could add another method to MutableNode called take_children, that uses mem::swap to exchange the children with an empty Vec, which could be initialized using Vec::with_capacity(0) or even mem::uninitialized, if you would want to go with unsafe code.
That way we could manipulate the children directly without keeping a mutable reference to the node by taking ownership.
An empty Vec probably has a smaller performance impact than the cloning, but it lowers the readability of the code.
To be honest I try to avoid unsafe wherever possible, but I'm sure you've seen that I've already used unsafe a few times to avoid bounds checking (since we're doing the bounds checking already).
With that said, (and if I'm following your line of logic properly) I think this is probably a good idea.
Just to make sure I'm following you on this:
- add set_children to MutableNode
- add take_children to MutableNode
- in sort_by_[whatever] call take_children, sort the vec, and then call set_children
Is that correct?
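The three steps above could be sketched like this in safe code. Node, Arena and the usize indices are simplified, hypothetical stand-ins for the library's types, and take_children/set_children are shown as inherent methods rather than on a MutableNode trait. I am using mem::take here, which does the same job as swapping in an empty Vec.

```rust
use std::mem;

/// Simplified, hypothetical node: payload plus the indices of its children.
struct Node<T> {
    data: T,
    children: Vec<usize>,
}

impl<T> Node<T> {
    /// Move the child list out, leaving an empty Vec behind.
    fn take_children(&mut self) -> Vec<usize> {
        mem::take(&mut self.children)
    }

    /// Put a child list back, replacing whatever is currently stored.
    fn set_children(&mut self, children: Vec<usize>) {
        self.children = children;
    }
}

/// Simplified, hypothetical tree storage: all nodes live in one Vec.
struct Arena<T> {
    nodes: Vec<Node<T>>,
}

impl<T> Arena<T> {
    /// Sort the children of `parent` by a key derived from each child's data.
    /// Panics if an index is out of range.
    fn sort_children_by_key<K, F>(&mut self, parent: usize, mut key: F)
    where
        K: Ord,
        F: FnMut(&T) -> K,
    {
        // 1. Take ownership of the list; the mutable borrow ends immediately.
        let mut children = self.nodes[parent].take_children();
        // 2. Sort while reading other nodes through a shared borrow.
        children.sort_by_key(|&id| key(&self.nodes[id].data));
        // 3. Hand the sorted list back to the parent.
        self.nodes[parent].set_children(children);
    }
}
```

One thing to keep in mind with this shape: if the key closure panics between steps 1 and 3, the parent is left with an empty child list.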
Also, from the docs here:
> In particular, if you construct a Vec with capacity 0 via Vec::new(), vec![], Vec::with_capacity(0), or by calling shrink_to_fit() on an empty Vec, it will not allocate memory.
So mem::uninitialized might be overkill in this case?
Yes that is correct.
And that does indeed sound like overkill in that case. I was trying to avoid a heap allocation for the Vec, but if it does not allocate in the first place, mem::uninitialized just skips the initializer, which should not have a huge impact on performance.
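A quick way to check this in safe code is to look at the capacity of the placeholder that is left behind after taking the children. The helper name take_and_report is made up for this sketch.

```rust
use std::mem;

/// Move the child list out and report the capacity of the empty
/// placeholder Vec that remains in its place.
fn take_and_report(children: &mut Vec<usize>) -> (Vec<usize>, usize) {
    // mem::take leaves `Vec::default()`, i.e. `Vec::new()`, behind.
    let taken = mem::take(children);
    (taken, children.capacity())
}
```

The reported capacity is 0, so the placeholder has no heap buffer and there is nothing for mem::uninitialized to save.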