dataframe_to_tree doesn't work when path_col is not the first column in the dataframe
dkadish opened this issue · comments
When attempting to use path_col in the dataframe_to_tree function, the function fails if path_col is not the first column in the dataframe.
Example:
import bigtree as bt
import pandas as pd
print(bt.__version__)
'0.9.3'
d = pd.DataFrame([
['a/b/c', 'c/e/f'],
['a/c', 'c/g'],
['a', 'c/h']
],
columns=['one','two'])
print(d)
tree = bt.dataframe_to_tree(d, 'one')
bt.print_tree(tree)
one two
0 a/b/c c/e/f
1 a/c c/g
2 a c/h
a
├── b
│ └── c
└── c
tree = bt.dataframe_to_tree(d, 'two')
bt.print_tree(tree)
TreeError: Error: Path does not have same root node, expected c, received a. Check your input paths or verify that path separator sep is set correctly
# Move 'two' to first column
d = d.assign(three=d.one).drop(columns='one')
print(d)
tree = bt.dataframe_to_tree(d, 'two')
bt.print_tree(tree)
two three
0 c/e/f a/b/c
1 c/g a/c
2 c/h a
c
├── e
│ └── f
├── g
└── h
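The column shuffle above can be generalised for anyone stuck on 0.9.3. This is only a sketch: path_col_first is a hypothetical helper name, not part of bigtree, and it works on a plain list of column names so it can be checked without pandas. Unlike the snippet above, it keeps the original column names.

```python
def path_col_first(columns, path_col):
    """Return the column names reordered so that path_col comes first.

    Hypothetical helper for illustration; not part of bigtree.
    """
    if path_col not in columns:
        raise KeyError(path_col)
    # Keep the remaining columns in their original relative order.
    return [path_col] + [c for c in columns if c != path_col]


reordered = path_col_first(["one", "two"], "two")
print(reordered)

# With pandas, the reordered list can be used to select columns, e.g.:
#   d = d[path_col_first(list(d.columns), "two")]
#   tree = bt.dataframe_to_tree(d, "two")
```

Selecting columns with a list returns a new dataframe, so the original d is left untouched.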
The issue arises here, where the code does not pass the path_col parameter to the add_dataframe_to_tree_by_path function.
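The bug pattern described above can be shown with a toy model. The functions below are hypothetical stand-ins, not bigtree's actual code, and they assume the inner helper falls back to the first column when no path column is given, which is consistent with the behaviour reported in this issue.

```python
# Toy model only: hypothetical stand-ins, not bigtree's implementation.

def add_rows_by_path(rows, columns, path_col=""):
    """Collect path strings, defaulting to the first column (assumed behaviour)."""
    if not path_col:
        path_col = columns[0]
    idx = columns.index(path_col)
    return [row[idx] for row in rows]


def to_paths_buggy(rows, columns, path_col=""):
    # Bug pattern: path_col is accepted but never forwarded to the helper.
    return add_rows_by_path(rows, columns)


def to_paths_fixed(rows, columns, path_col=""):
    # Fix pattern: forward path_col so the helper reads the requested column.
    return add_rows_by_path(rows, columns, path_col=path_col)


rows = [["a/b/c", "c/e/f"], ["a/c", "c/g"], ["a", "c/h"]]
columns = ["one", "two"]

buggy = to_paths_buggy(rows, columns, "two")  # paths read from column 'one'
fixed = to_paths_fixed(rows, columns, "two")  # paths read from column 'two'
print(buggy)
print(fixed)
```

In the buggy variant the paths start with a even though column two was requested, which matches the "expected c, received a" mismatch in the error message.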
Hello, thanks for raising this issue and submitting the PR! The fix is implemented in v0.9.4; please upgrade bigtree for the latest changes.