Unifurcating root causes `QuartetDistance`, `TripletDistance` error
mmore500 opened this issue
Trees with unifurcating roots cause `QuartetDistance` and `TripletDistance` to abort with an error. Here's a minimal working example of the issue:
> library('Quartet')
Loading required package: TreeTools
Loading required package: ape
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Pop!_OS 22.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Quartet_1.2.5 TreeTools_1.9.1 ape_5.7-1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.10 magrittr_2.0.3 bit_4.0.5 viridisLite_0.4.1
[5] xtable_1.8-4 colorspace_2.1-0 R.cache_0.16.0 lattice_0.20-45
[9] R6_2.5.1 rlang_1.1.0 fastmap_1.1.1 fastmatch_1.1-3
[13] Ternary_2.1.3 tools_4.1.2 rbibutils_2.2.13 parallel_4.1.2
[17] grid_4.1.2 nlme_3.1-162 R.oo_1.25.0 cli_3.6.1
[21] ellipsis_0.3.2 htmltools_0.5.5 bit64_4.0.5 digest_0.6.31
[25] lifecycle_1.0.3 shiny_1.7.4 later_1.3.0 promises_1.2.0.1
[29] R.utils_2.12.2 Rdpack_2.4 mime_0.12 sp_1.6-0
[33] compiler_4.1.2 R.methodsS3_1.8.2 httpuv_1.6.9
> QuartetDistance("newick1.tree.txt", "newick2.tree.txt")
Error: Leaves don't agree: a tip in tree 2 didn't exist in first tree! Aborting.
> QuartetDistance("newick1.tree.txt", "newick1.tree.txt")
[1] 0
> QuartetDistance("newick2.tree.txt", "newick2.tree.txt")
[1] 0
((8,3),(5,6));
(((3,8),(5,6)));
Looks like this is an upstream issue with the tqdist library this package wraps, as I am able to reproduce it directly with tqdist 1.0.2. I've contacted the author of tqdist, who wrote back and is looking into the bug. Just wanted to leave this note here for now in case anyone else runs into this issue. I'll provide an update when I hear further from Christian.
In case it's useful, here are some more details about the bug I shared with Christian.
The bug can be reproduced from these two example files.
tree1.newick
(((3,8),((5,6))));
and
tree2.newick
((3,8),((5,6)));
tree1 and tree2 are identical, except tree1 has a tacked-on root unifurcation.
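To make "tacked-on root unifurcation" concrete, here is a minimal Python sketch that peels redundant outer parentheses off a Newick string. The helper names are hypothetical (they are not part of tqdist or Quartet), and the sketch assumes plain Newick with no branch lengths, internal-node labels, or quoted labels.

```python
def _root_has_single_internal_child(body: str) -> bool:
    """True if `body` (Newick without the trailing ';') is one group wrapping one group."""
    if not (body.startswith("(") and body.endswith(")")):
        return False
    inner = body[1:-1]
    # The only child must itself be a parenthesised group.
    if not (inner.startswith("(") and inner.endswith(")")):
        return False
    depth = 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0 and i < len(inner) - 1:
                return False  # first group closes early, so the root has siblings
        elif ch == "," and depth == 0:
            return False  # top-level comma means more than one child
    return True


def strip_root_unifurcation(newick: str) -> str:
    """Drop unifurcations at the root only; non-root unifurcations are left alone."""
    body = newick.strip()
    if body.endswith(";"):
        body = body[:-1].rstrip()
    while _root_has_single_internal_child(body):
        body = body[1:-1]
    return body + ";"


tree1 = "(((3,8),((5,6))));"
tree2 = "((3,8),((5,6)));"
print(strip_root_unifurcation(tree1) == tree2)
```

After stripping, tree1 is character-for-character the same string as tree2, so writing the normalized string back to a file before calling quartet_dist reduces the failing comparison to the tree2-versus-tree2 case.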
Comparing the trees gives
$ quartet_dist -v tree1.newick tree2.newick
Leaves doesn't agree! Aborting! ( didn't exist in second tree)
The two trees do not have the same set of leaves.
Aborting.
even though comparing each to itself works as it should
$ quartet_dist -v tree2.newick tree2.newick
4 1 0 0 1 1 0 0
$ quartet_dist -v tree1.newick tree1.newick
5 5 0 0 5 1 0 0
Interestingly, non-root unifurcations don't seem to cause any issues as far as I can tell. The issue isn't reproducible across all root unifurcations. For example, if tree2 is "(((3,(5)),6,8));", comparison with tree1 works fine. However, comparing "(((3,(5)),6,8));" to "((3,(5)),6,8);" has the bug. The same issue arises with triplet_dist, too.
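The abort message says the leaf sets differ, so it may help to confirm independently that they don't. Below is a rough Python sketch that pulls tip labels straight out of the Newick text. `leaf_labels` is a hypothetical helper, and it assumes simple unquoted labels like the ones in these examples.

```python
import re


def leaf_labels(newick: str) -> set:
    """Collect tip labels: a tip label directly follows '(' or ','."""
    # Internal-node labels would follow ')', so this pattern skips them.
    found = re.findall(r"(?<=[(,])[^(),;:]+", newick)
    return {label.strip() for label in found if label.strip()}


tree1 = "(((3,8),((5,6))));"
tree2 = "((3,8),((5,6)));"
print(sorted(leaf_labels(tree1)))
print(leaf_labels(tree1) == leaf_labels(tree2))
```

Both strings yield the same four tips (3, 5, 6, 8), which suggests the "leaves don't agree" message comes from how the root unifurcation is parsed rather than from the input data.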
Thanks for the report; do keep me posted and I'll propagate any updates to tqDist to the Quartet package.