JonathanShor / DoubletDetection

Doublet detection in single-cell RNA-seq data.

Home Page: https://doubletdetection.readthedocs.io/en/stable/

Phenograph params return NA on PBMC8K

TomKellyGenetics opened this issue

I suspect something has been disrupted by switching to the new methods. I'm testing the PhenoGraph implementation for comparison with other methods in R.

On the PBMC8K dataset, the following settings return 524 doublets and 7857 singlets:

clf = doubletdetection.BoostClassifier(n_iters=50, use_phenograph=False, standard_scaling=True)
doublets = clf.fit(raw_counts).predict(p_thresh=1e-16, voter_thresh=0.5)

However, running with PhenoGraph gives 8378 singlets and 3 NaN values:

clf = doubletdetection.BoostClassifier(n_iters=50, use_phenograph=True, standard_scaling=False)
doublets = clf.fit(raw_counts).predict(p_thresh=1e-16, voter_thresh=0.5)

This occurs with standard_scaling=False whether use_phenograph is True or False. I suspect this means that scaling is now required by the scanpy PCA functions (perhaps to transpose the matrix). Although standard_scaling=False is recommended, it does not work in the current version.
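For anyone reproducing these counts, here is a minimal sketch of how the labels could be tallied. It assumes the array returned by predict holds 1 for doublets, 0 for singlets, and NaN where no call was made; tally_labels and example_labels are hypothetical names for illustration and are not part of doubletdetection.

```python
import math
from collections import Counter


def tally_labels(labels):
    """Count doublet (1), singlet (0), and NaN entries in a label sequence."""
    counts = Counter()
    for value in labels:
        if math.isnan(value):
            counts["nan"] += 1
        elif value == 1:
            counts["doublet"] += 1
        elif value == 0:
            counts["singlet"] += 1
        else:
            # Anything else is unexpected and worth inspecting separately.
            counts["other"] += 1
    return dict(counts)


# Hypothetical label vector standing in for the output of predict().
example_labels = [0.0, 1.0, 0.0, float("nan"), 0.0]
print(tally_labels(example_labels))
```

Both runs reported above total 8381 cells (524 + 7857 and 8378 + 3), so a tally like this also confirms that no cells were dropped between configurations.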

Update: I'm currently testing whether PhenoGraph can be run with standard_scaling=True in Python. The reticulate version seems to be working on a transposed matrix.

Hi Tom,

The new thresholds only apply when not using PhenoGraph and using standard scaling.

If you are using PhenoGraph (which we haven't tested with standard scaling turned on), we suggest using the default thresholds.

Yeah, I think that’s a good approach for the future. The main reason I’m trying to get the PhenoGraph version working is to compare it to the R version. I want to know whether the native R version can reproduce similar results before updating it to match the Louvain version in Python.

I’ll look at using the old thresholds for predict. I think there were some NAs in the object scores as well, but I’ll check when this happens. I tested it with and without scaling, but I haven’t had time to look into the results yet. I’ll post the code here if it turns out to be a reproducible issue.

Closing due to inactivity.