v0.17.0 dataframe_to_tree_by_relation TypeError str and float
nextgenusfs opened this issue
Describe the issue
Environment
- Platform: centos7
- Python version: 3.10.14
- bigtree version: 0.17.0
I'm seeing changes in the behavior of dataframe_to_tree_by_relation. This code worked in previous versions (e.g., it runs without error in v0.16.4):
# rd is a pandas dataframe
>>> rd.shape
(412, 7)
>>> rd.columns
Index(['assembly', 'illumina', 'ont', 'parent', 'seqread', 'strain', 'talias'], dtype='object')
>>> root = dataframe_to_tree_by_relation(rd, parent_col="parent", child_col="strain", attribute_cols=["talias", "seqread", "assembly"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/envs/vtools/lib/python3.10/site-packages/bigtree/tree/construct.py", line 947, in dataframe_to_tree_by_relation
    f"Unable to determine root node\nPossible root nodes: {sorted(root_names)}"
TypeError: '<' not supported between instances of 'str' and 'float'
I'm not sure which column of the DataFrame this is referring to?
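For context, the traceback points at the sorted(root_names) call rather than at a named column. The snippet below is a minimal sketch, independent of bigtree, of how sorting a collection that mixes str and float values produces this kind of TypeError. The values and the try_sort helper are made up for illustration; the assumption is that a missing cell (NaN, which is a float) ended up among the candidate root names.

```python
# Hypothetical candidate root names: two strings plus a NaN, the float value
# pandas uses for missing cells. These are not taken from the real DataFrame.
root_names = ["strainA", float("nan"), "strainB"]


def try_sort(values):
    """Return the sorted list, or the TypeError message if sorting fails."""
    try:
        return sorted(values)
    except TypeError as exc:
        return str(exc)


result = try_sort(root_names)
print(result)  # a "'<' not supported between instances of ..." message
```

If this is what is happening, the offending values would come from whichever of parent_col or child_col holds both strings and floats.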
Ah, there were some changes and fixes to DataFrame operations in the latest version. Could you share what rd contains so I can recreate the error?
Edit: Can you check whether your child_col (strain) is of string datatype? It could be that some strain values are str and some are float.
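One way to run that check is to count the Python types present in a column. This is only a sketch: type_counts is a hypothetical helper, not part of bigtree or pandas, and the sample list stands in for the real column values.

```python
from collections import Counter


def type_counts(values):
    """Count how many values of each Python type appear in an iterable."""
    return Counter(type(v).__name__ for v in values)


# Hypothetical stand-in for rd["strain"]; NaN marks a missing cell.
strain_values = ["s1", "s2", float("nan"), "s3"]
counts = type_counts(strain_values)
print(counts)  # Counter({'str': 3, 'float': 1})
```

Since iterating a pandas Series yields its values, the same helper should work as type_counts(rd["strain"]) and type_counts(rd["parent"]). More than one type in the result would suggest the column mixes str and float.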
Hi, thanks for raising this issue! The fix is implemented in v0.17.1; please upgrade bigtree with the command pip install --upgrade bigtree.