Join forces with anytree
lverweijen opened this issue · comments
Would it be an idea to join forces with anytree?
I see a lot of overlap between both projects and perhaps a merger can combine the best of both worlds.
Since both have a MIT license, it might not even be that difficult.
One difference is that bigtree depends on pandas, whereas anytree is pure Python.
Maybe pandas can be made optional or even dropped if there is no performance difference.
Projects:
@lverweijen Bigtree is really cool and maintained. Feature-wise it outshines anytree. At this point, it's maybe a bit too late, as I see anytree as a subset of bigtree.
@Abdur-rahmaanJ bigtree has more features (I think), but anytree's code / design looks more fleshed out.
I was thinking of maybe adding to anytree what's missing there, but already present in bigtree.
Something I would like anytree to have is more options for importing / exporting to different formats.
Since the apis are similar enough, it shouldn't be that hard to port features back and forth, but I would hope for a collaboration.
edit: I originally wrote "anytree has more features". I meant "bigtree" has more. Although, they both have features lacking from the other.
Here are some differences:
- anytree has NodeMixin, bigtree has BaseNode. NodeMixin is slightly more flexible, because it doesn't require a name attribute.
- anytree has SymlinkNode. It's like a shallow copy of a Node.
- anytree has separator as class attribute. bigtree has sep on root. I prefer class attribute, because of O(1) lookup time.
- anytree is python only, bigtree has a dependency on pandas. Having it as an optional dependency is preferable.
- anytree has resolver. bigtree's find_paths comes close, but doesn't support wildcards.
- anytree has ZigZagGroupIter. Not sure if it's needed.
- bigtree has type annotations in the codebase. These are missing from anytree.
- bigtree has BinaryNode. Basically a node with at most 2 children.
- bigtree has DAGNode. A node with multiple parents.
- bigtree has import / export to / from list, dict, nested dict and dataframe. anytree only supports json and dicts.
- bigtree has bulk modification functions (shift_nodes/copy_nodes). Not sure what benefits they have over modifying nodes directly.
- bigtree has workflows. They seem a bit too specific to include in the library itself. Maybe include them in documentation/examples instead.
There are a few ways to continue from:
- Add one project to the other. So either anytree should consume bigtree or bigtree should consume anytree.
- Start a new project that is the successor to both, to which both source packages can contribute. About individual differences a discussion can be started.
- The projects should gradually grow towards each other, copying features until they are exactly the same and maybe merge over time.
Hello, thanks for your comprehensive comparison! To address your points,
- BaseNode in bigtree does not require a name attribute, so I would think it is similar to NodeMixin, as it is easily extendable. Similarly, I don't see a need for SymlinkNode because in any case, users can just copy or extend Node for their use case, unless I have misunderstood the usage and purpose of SymlinkNode. Examples of how to extend Node can be found in the documentation.
- separator/sep should be consistent for the whole tree, which should not be implemented as a class attribute for each node. If you do this on anytree, you will notice the issue,
from anytree import Node
a = Node("a")
b = Node("b", parent=a)
b.separator = "-"
b
# Node('-a-b', separator='-')
a
# Node('/a')
- Making pandas an optional library has been raised as an issue previously; you can refer to the issue here.
- Resolver is interesting! This can be a future enhancement 👍
- I didn't see a need for ZigZagGroupIter, but this can be a possible future enhancement as well!
Moving forward, I'd be happy to continuously enhance and fix bigtree; do continue to raise issues as well. Thanks for your support on bigtree!
separator/sep should be consistent for the whole tree, which should not be implemented as a class attribute for each node. If you do this on anytree, you will notice the issue
You are right about that. If using a class attribute, it can perhaps be prevented by using type(self).separator instead.
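To illustrate the suggestion, here is a minimal sketch of a node class that reads the separator through type(self) rather than through the instance. SketchNode and its path property are hypothetical names made up for this example; this is not anytree's actual implementation.

```python
class SketchNode:
    """Hypothetical node class, not anytree's real Node."""

    separator = "/"  # class-level default shared by every instance

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def path(self):
        # Look the separator up on the class, so an assignment such as
        # `node.separator = "-"` on a single instance is ignored here.
        sep = type(self).separator
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return sep + sep.join(reversed(names))


a = SketchNode("a")
b = SketchNode("b", parent=a)
b.separator = "-"  # creates an instance attribute; the class attribute is unchanged

print(a.path)  # /a
print(b.path)  # /a/b
```

With this lookup, both nodes render their path with the same separator even after the assignment on b.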
If you use type(self).separator, you are using the separator of the anytree.node.node.Node class, which is always /. You solve the issue of separator discrepancy, but you sacrifice customizability; users cannot choose their own separator since it is referencing the default Node class.
Given these concerns, I would advise against using a class attribute and instead have the separator/sep synced and consistent yet customizable for the whole tree, which is why I chose to implement it as a class property referencing the root node's sep.
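The idea of a property that defers to the root can be sketched roughly as follows. RootSepNode, _sep and path_name are names assumed for this illustration; the sketch only shows the general approach and is not bigtree's actual code.

```python
class RootSepNode:
    """Hypothetical node whose separator always comes from the root."""

    def __init__(self, name, parent=None, sep="/"):
        self.name = name
        self.parent = parent
        self._sep = sep  # only the value stored on the root is ever used

    @property
    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    @property
    def sep(self):
        # Every node reads the separator from the root.
        return self.root._sep

    @sep.setter
    def sep(self, value):
        # Setting it on any node updates the root, and so the whole tree.
        self.root._sep = value

    @property
    def path_name(self):
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return self.sep + self.sep.join(reversed(names))


a = RootSepNode("a")
b = RootSepNode("b", parent=a)
b.sep = "-"  # changes the separator for the entire tree

print(a.path_name)  # -a
print(b.path_name)  # -a-b
```

The trade-off in this sketch is that reading sep walks up to the root, so the lookup cost grows with node depth instead of being constant.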
The following enhancements have been made available in bigtree v0.10.0:
- find_relative_path: Similar to Resolver from anytree, able to find relative paths with . / .. / * notations
- zigzaggroup_iter: Similar to ZigZagGroupIter from anytree
- zigzag_iter: Zig Zag iteration, not present in anytree
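For readers unfamiliar with zigzag traversal, the sketch below shows the general technique on a tiny stand-in tree: visit the tree level by level and flip the direction on every level. SimpleNode, zigzag_groups and zigzag are hypothetical names for this example and do not reproduce bigtree's implementation or signatures.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class SimpleNode:
    """Hypothetical minimal node used only for this illustration."""
    name: str
    children: List["SimpleNode"] = field(default_factory=list)


def zigzag_groups(root: SimpleNode) -> Iterator[List[str]]:
    """Yield node names one level at a time, alternating direction."""
    level = [root]
    left_to_right = True
    while level:
        ordered = level if left_to_right else list(reversed(level))
        yield [node.name for node in ordered]
        # The next level is always collected in natural left-to-right order.
        level = [child for node in level for child in node.children]
        left_to_right = not left_to_right


def zigzag(root: SimpleNode) -> Iterator[str]:
    """Flatten the grouped zigzag traversal into a single sequence."""
    for group in zigzag_groups(root):
        yield from group


tree = SimpleNode("a", [
    SimpleNode("b", [SimpleNode("d"), SimpleNode("e")]),
    SimpleNode("c", [SimpleNode("f")]),
])

print(list(zigzag_groups(tree)))  # [['a'], ['c', 'b'], ['d', 'e', 'f']]
print(list(zigzag(tree)))         # ['a', 'c', 'b', 'd', 'e', 'f']
```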
Do upgrade bigtree with pip install --upgrade bigtree to get the latest changes! 😄
I will be closing this ticket as well. If there are any new feature requests, enhancements, or bug fixes, do raise another issue.
Update: Pandas is now an optional library in v0.12.0!
Thanks for all your support and suggestions in making bigtree better 😄
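As a rough illustration of what making pandas optional usually involves, the sketch below guards the import and raises only when a DataFrame export is actually requested. paths_to_rows and paths_to_dataframe are hypothetical helpers written for this example; how bigtree implements this internally is not shown in the thread.

```python
try:
    import pandas as pd  # optional dependency
except ImportError:  # pandas is not installed
    pd = None


def paths_to_rows(paths, sep="/"):
    """Pure-Python export: one dict per path, no pandas required."""
    return [
        {"path": path, "name": path.rstrip(sep).split(sep)[-1]}
        for path in paths
    ]


def paths_to_dataframe(paths, sep="/"):
    """DataFrame export that fails with a clear message if pandas is missing."""
    if pd is None:
        raise ImportError(
            "pandas is required for DataFrame export; "
            "install it with `pip install pandas`"
        )
    return pd.DataFrame(paths_to_rows(paths, sep))


rows = paths_to_rows(["/a", "/a/b"])
print(rows)
```

The core functions keep working without pandas, and only the DataFrame-specific entry point depends on it.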