plger / scDblFinder

Methods for detecting doublets in single-cell sequencing data

Home Page: https://plger.github.io/scDblFinder/

Running scDblFinder deterministic and serial with the 'samples' parameter

cwoehle opened this issue

Hello,

I was trying to run scDblFinder with the samples parameter and set.seed(), but without BPPARAM, and noticed that the results were not reproducible (the number of doublets found was not the same between runs).
Either removing the samples parameter or adding BPPARAM=MulticoreParam(1, RNGseed=seed) produced reproducible results.
However, I was looking for a way to run serially that is suitable for RStudio (I keep having problems with BiocParallel), and I needed to consider individual samples. So, after some testing, I ended up using BPPARAM=SerialParam(RNGseed = seed), which seems to give the behaviour I was looking for.
I did not find any mention of SerialParam() in the documentation. Would this also be your suggested solution in my case, or could there be a better alternative?

I'm grateful for any clarification.

Best wishes,
Christian

Hi Christian,
Thanks for bringing this up. I never noticed because I always run with multithreading, but you're probably not the only user who will face this. Yes, I'd use your solution, and I have now added this to the vignette (in the FAQ on reproducibility).
Best,
plger

Hi Pierre-Luc,

"Yes I'd use your solution, and I now added this in the vignette (in the FAQ on reproducibility)."

Not sure what you were referring to here, but if it is this section of the vignette, I still find it confusing...

Could you maybe mention that this way of setting the seed should always be used when the samples argument is used, even when the default BPPARAM=SerialParam() is used?

You're right, it wasn't clear. I hope it is now.
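
For reference, the approach discussed in this thread could be sketched roughly as follows. This is a minimal sketch, not the vignette's wording: `sce` is assumed to be an existing SingleCellExperiment, and the colData column name "sample_id", the helper names, and the seed value are hypothetical placeholders.

```r
library(scDblFinder)
library(BiocParallel)

seed <- 1234  # arbitrary value, chosen only for illustration

# Serial execution with per-sample processing. When `samples` is used,
# the seed is passed through BPPARAM rather than relying on set.seed() alone.
# "sample_id" is a hypothetical colData column holding the sample labels.
run_serial <- function(sce, seed) {
  scDblFinder(sce, samples = "sample_id",
              BPPARAM = SerialParam(RNGseed = seed))
}

# Multithreaded variant mentioned in the thread, with the same seeding idea.
run_multicore <- function(sce, seed, workers = 1) {
  scDblFinder(sce, samples = "sample_id",
              BPPARAM = MulticoreParam(workers, RNGseed = seed))
}
```

One way to check reproducibility would be to call `run_serial(sce, seed)` twice and compare `table(res$scDblFinder.class)` between the two results.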