JonathanShor / DoubletDetection

Doublet detection in single-cell RNA-seq data.

Home Page: https://doubletdetection.readthedocs.io/en/stable/

setting random seed for reproducibility?

yueqiw opened this issue

Hi,

It may be a good idea to allow setting a random seed, so that other people can reproduce the exact same classifier given the same inputs.

I agree with you. Unfortunately, one of the package's dependencies, Phenograph, does not provide such an option. It is a major source of randomness, and until it can be made deterministic, no full solution is possible.

However, we could look into making the synthetic creation process more controllable/repeatable in one of the next updates.

There are plans to eventually modify parts of Phenograph to improve its integration with this project. We'll add a random seed parameter to the list of features to investigate, but this will likely be a bit further off.

I'm not 100% sure, but it seems like we can use np.random.seed to ensure that the same synthetic doublets are made. We can also set the seed of PCA.
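
A minimal sketch of the idea, not the package's actual code: the helper name make_synthetic_doublets, its parameters, and the toy count matrix are all hypothetical, and summing two randomly chosen cells is only an assumed way of building a synthetic doublet. It uses a local np.random.RandomState(seed) rather than the global np.random.seed call, so the seed does not leak into other code; calling np.random.seed(seed) before the sampling would serve the same purpose.

```python
import numpy as np


def make_synthetic_doublets(counts, n_doublets, seed=None):
    """Hypothetical helper: sum random pairs of cells to form synthetic doublets."""
    # Local generator: the same seed always yields the same parent pairs.
    rng = np.random.RandomState(seed)
    n_cells = counts.shape[0]
    # Draw two parent cell indices for each synthetic doublet.
    parents = rng.randint(0, n_cells, size=(n_doublets, 2))
    return counts[parents[:, 0]] + counts[parents[:, 1]]


# Toy count matrix: 50 cells x 20 genes (made-up data, for illustration only).
counts = np.random.RandomState(0).poisson(1.0, size=(50, 20))

a = make_synthetic_doublets(counts, n_doublets=10, seed=42)
b = make_synthetic_doublets(counts, n_doublets=10, seed=42)
same = np.array_equal(a, b)  # identical seed, identical synthetic doublets
```

For the PCA step, scikit-learn's PCA takes a random_state argument that could be given the same seed. None of this affects the randomness inside Phenograph.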

This much seems easy enough for v2.4.

Closed with #111.