Teichlab / bbknn

Batch balanced KNN

Any plans to make an R package for bbknn?

MJ-Yang opened this issue

I really like the robustness of your integration, and especially the fast performance.
I prefer to work solely in R because I find it more intuitive.
So are there any plans to make an R package for bbknn? It would be really wonderful.

Thanks for the kind words. At this point, there are no plans to make an R package, but BBKNN is perfectly runnable via reticulate. See here for basic syntax to get a neighbourhood graph. To get post-BBKNN UMAP coordinates, do this:

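library(reticulate)
use_python("/usr/bin/python3")  # adjust to the Python that has scanpy and bbknn installed

anndata = import("anndata", convert=FALSE)
bbknn = import("bbknn", convert=FALSE)
# scanpy.api is deprecated and absent from newer scanpy releases; import scanpy directly
sc = import("scanpy", convert=FALSE)

# pca: matrix with cells as rows; batch: vector of batch assignments
adata = anndata$AnnData(X=pca, obs=batch)
sc$tl$pca(adata)            # dummy run to create a correctly formatted .obsm
adata$obsm$X_pca = pca      # overwrite with the actual PCA coordinates
bbknn$bbknn(adata, batch_key=0)
sc$tl$umap(adata)
umap = py_to_r(adata$obsm$X_umap)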

Given a vector of batch assignments named batch and a PCA matrix named pca with cells as rows, this creates an AnnData object and runs PCA on appropriately shaped dummy data (e.g. your PCA matrix) to create a correctly formatted .obsm. It then replaces the dummy result with the actual PCA coordinates, runs BBKNN and UMAP, and ports the resulting UMAP coordinates back to R. You could easily expand this with clustering if desired.
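As a rough sketch of the clustering extension mentioned above, the lines below continue from the same session, reusing adata, sc and batch. Choosing Leiden clustering here is an assumption, not something prescribed in this thread: scanpy also offers sc.tl.louvain, and sc.tl.leiden needs the leidenalg Python package installed. The resolution value and the clusters variable name are likewise illustrative.

```r
# Continues from the snippet above: adata already holds the BBKNN graph.
# Assumption: leidenalg is installed in the same Python environment as scanpy.
sc$tl$leiden(adata, resolution=1.0)

# Cast the categorical labels to plain strings on the Python side,
# so that R receives an ordinary character vector.
clusters = unlist(py_to_r(adata$obs$leiden$astype("str")$tolist()))

# Cross-tabulate clusters against batches as a quick check of batch mixing.
table(clusters, batch)
```

If most clusters contain cells from several batches, that is a reasonable first sign that the integration mixed the batches, although it is not a formal assessment.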