emmanuelparadis / ape

analysis of phylogenetics and evolution

Home Page: http://ape-package.ird.fr/

njs, bionj, and bionjs sometimes cause RStudio to abort when run on only 2 sequences

FreekManders opened this issue

Hi Emmanuel, thank you for the package.
When I run njs, bionj, or bionjs on a distance matrix computed from only two sequences, RStudio sometimes aborts.
At other times the call throws an uninformative error while still returning a tree:

Error in if (tabulate(phy$edge[, 1])[ntips + 1] > 2) FALSE else TRUE : 
  missing value where TRUE/FALSE needed

While it doesn't make sense to build a tree from such a distance matrix, doing so should not crash RStudio.
The nj function does give an error with a helpful message:

Error in nj(my_dist) : 
  cannot build an NJ tree with less than 3 observations

It might be useful to update the first three functions to throw an error similar to the one from nj.

Example:

library(ape)
my_dist <- structure(0.2, Size = 2L, Labels = c("a", "b"),
                     Upper = FALSE, Diag = FALSE, method = "K80", class = "dist")
nj(my_dist)      # Clear error
bionjs(my_dist)  # Less clear error / sometimes crashes RStudio
bionj(my_dist)   # Less clear error / sometimes crashes RStudio
njs(my_dist)     # Less clear error / sometimes crashes RStudio

Session info:

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_Netherlands.utf8  LC_CTYPE=English_Netherlands.utf8    LC_MONETARY=English_Netherlands.utf8
[4] LC_NUMERIC=C                         LC_TIME=English_Netherlands.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ape_5.7-1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10      lattice_0.20-45  tidyr_1.3.0      fansi_1.0.4      digest_0.6.31    utf8_1.2.3      
 [7] dplyr_1.1.0      grid_4.2.3       R6_2.5.1         nlme_3.1-162     lifecycle_1.0.3  magrittr_2.0.3  
[13] pillar_1.9.0     rlang_1.1.0      cli_3.6.1        rstudioapi_0.14  vctrs_0.6.1      generics_0.1.3  
[19] tools_4.2.3      glue_1.6.2       purrr_1.0.1      parallel_4.2.3   compiler_4.2.3   pkgconfig_2.0.3 
[25] tidyselect_1.2.0 tibble_3.2.0    

In the cases where njs does generate a tree, trying to plot it leads to very high memory usage (11 GB), which causes RStudio to crash.

Thanks! These three functions now check that there are at least three observations. I added the same check in fastme.bal and fastme.ols.
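
For anyone still on an affected version, a guard along these lines could be applied before calling these functions. This is only a sketch in base R: `check_min_obs` is a hypothetical helper, not part of ape, and it is not necessarily how the package implements its own check.

```r
# Hypothetical helper, not part of ape: stop early when a distance
# object covers fewer than `min_obs` observations.
check_min_obs <- function(d, min_obs = 3L) {
  # A "dist" object stores the number of observations in its "Size"
  # attribute; anything else is treated as a square distance matrix.
  n <- if (inherits(d, "dist")) attr(d, "Size") else nrow(as.matrix(d))
  if (n < min_obs) {
    stop("cannot build a tree with fewer than ", min_obs, " observations")
  }
  invisible(d)
}

# Same two-sequence distance object as in the example above.
my_dist <- structure(0.2, Size = 2L, Labels = c("a", "b"),
                     Upper = FALSE, Diag = FALSE, method = "K80",
                     class = "dist")

# The guard fails with a clear message before any tree code is reached.
res <- tryCatch(check_min_obs(my_dist),
                error = function(e) conditionMessage(e))
print(res)
```

Because the helper returns its input invisibly, it can wrap the argument directly, e.g. `bionj(check_min_obs(my_dist))`, so the tree-building call is never reached when there are too few observations.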