Parallelization in dist.topo
V-Z opened this issue · comments
I asked about possibility to add parallelization to dist topo using parallel
and doParallel
. From the discussion I copy here some comments from Klaus Schliep:
Defaulting to "cores = detectCores()" is not a good idea. The CRAN Repository Policy states: "If running a package uses multiple threads/cores it must never use more than two simultaneously: the check farm is a shared resource and will typically be running many checks simultaneously." In short such a default can cause serious problems on a cluster and I got properly ripleyed for this years ago, but that's another story. So the default should be something like min(2L, detectCores()).
and
From my testing it works well. I'm not sure if
tryCatch
is really needed there.I am not sure if it is necessary, but you don't want to have it in the inner loop ;)
The proposal for discussion is in PR #71
Hi,
I modified dist.topo
following your suggestions. The code is updated on GH. The new option is now named mc.cores
(like in boot.phylo
) with 1 as default (see Klaus's comment on r-sig-phylo).
I also improved the code for the score distance which is twice faster with a single core/CPU. Interestingly, the score distance is more sensitive to the number of cores than the default distance. For the latter, the improvement is visible with many trees (200+) with 50+ tips.
Best, E.