neighbors_within_batch argument usage in R?

rwelling520 opened this issue 3 years ago

Hello! I'm trying to reproduce results from a paper that uses bbknn for scRNA-seq batch correction (I'm running bbknn from R). The authors say they set neighbors_within_batch to 10, but I'm running into an issue. The code runs fine in R when I don't set neighbors_within_batch (see my suggested edit to the final py_to_r line below). When I do set it, with bbknn$bbknn(adata, batch_key=0, neighbors_within_batch=10), I get an error with this traceback:

Error in py_call_impl(callable, dots$args, dots$keywords) :
  TypeError: 'float' object cannot be interpreted as an integer
5. stop(structure(list(message = "TypeError: 'float' object cannot be interpreted as an integer",
       call = py_call_impl(callable, dots$args, dots$keywords),
       cppstack = structure(list(file = "", line = -1L, stack = c("1 reticulate.so 0x0000000114a023ed _ZN4Rcpp9exceptionC2EPKcb + 221",
       "2 reticulate.so 0x0000000114a0a485 _ZN4Rcpp4stopERKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE + 53", ...
4. get_graph at __init__.py#148
3. bbknn_pca_matrix at __init__.py#355
2. bbknn at __init__.py#294
1. bbknn$bbknn(adata, batch_key = 0, neighbors_within_batch = 10)

My full code is as follows (pca matrix and batch assignment vector not shown; I don't think either of these is causing the error since I can run this code minus the neighbors_within_batch argument but happy to post how I generated them/what they contain if useful):

adata = anndata$AnnData(X=pca, obs=batch)
sc$tl$pca(adata)
adata$obsm$X_pca = pca
bbknn$bbknn(adata, batch_key=0, neighbors_within_batch=10)
sc$tl$umap(adata)
umap = py_to_r(adata$obsm[["X_umap"]])

I'm at a loss for what's causing this error... do you have any idea what I'm doing wrong? I'm assuming I can use the neighbors_within_batch parameter in R?

Also, I think umap = py_to_r(adata$obsm$X_umap) should be umap = py_to_r(adata$obsm[["X_umap"]])? I was only able to get the latter to work...

Thanks,
Rachel

P.S. I'm sorry if I've missed including anything or if this looks funky when it gets posted; this is the first time I've asked about an issue on github. Happy to provide more details if needed!

Krzysztof Polanski · Answer 1 · Wed Apr 21 2021 15:14:50 GMT+0800 (China Standard Time)

There's TypeError: 'float' object cannot be interpreted as an integer right at the top of your error. Maybe the R-to-Python conversion turns your 10 into a float, and the Python innards are not happy about it? Try passing as.integer(10) instead?
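To illustrate the suggestion above, here is a minimal Python sketch of what happens when a float reaches code that needs an integer. The function build_neighbor_slots is a hypothetical stand-in written for this example, not bbknn's actual code; the only assumption is that somewhere in the call chain the value is used where Python insists on a true int. R stores a bare 10 as a double, which reticulate hands to Python as 10.0, whereas as.integer(10) (or the literal 10L) arrives as the int 10.

```python
# Illustration only: build_neighbor_slots is a hypothetical stand-in,
# not a function from bbknn.
def build_neighbor_slots(neighbors_within_batch):
    # range() accepts only true integers, like many integer-only Python APIs
    return list(range(neighbors_within_batch))

# What Python receives when R passes a bare 10: the float 10.0
try:
    build_neighbor_slots(10.0)
    message = None
except TypeError as err:
    message = str(err)

print(message)

# What Python receives when R passes as.integer(10): the int 10
slots = build_neighbor_slots(10)
print(len(slots))
```

The first call fails with the same message as in the traceback, and the second runs normally.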

Rachel Wellington · Answer 2 · Thu Apr 22 2021 02:23:12 GMT+0800 (China Standard Time)

Thank you for the quick reply!

I kept working on this after posting (I thought I might have to switch to Python, but wanted to see if I could fix the error in R first), and I figured out that setting neighbors_within_batch=as.integer(10) instead of neighbors_within_batch=10 fixes the type error, and everything seems to run smoothly from there on. It seems we have to tell the function explicitly that the R value is an integer. In case anyone else sees this thread later, the same applies to numeric inputs to other functions called via reticulate that expect an integer (e.g. you can increase the number of calculated UMAP dimensions with sc$tl$umap(adata, n_components=as.integer(10)), whereas simply passing 10 also results in an error, although a different one that refers to slicing with non-integers).
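The different error mentioned for n_components is plausibly the one Python raises when a float is used as a slice bound. The sketch below shows that failure on a plain list; take_first is a hypothetical helper written for this illustration, and whether scanpy or umap fails at exactly this kind of operation is an assumption, not something confirmed in this thread.

```python
# Illustration only: take_first is a hypothetical helper, not part of
# scanpy or umap.
def take_first(values, n):
    # Slice bounds must be integers (or None); a float is rejected
    return values[:n]

coords = [0.1, 0.2, 0.3, 0.4]

# A float bound, as would arrive from a bare R numeric such as 2
try:
    take_first(coords, 2.0)
    slice_error = None
except TypeError as err:
    slice_error = str(err)

print(slice_error)

# An integer bound, as would arrive from as.integer(2)
first_two = take_first(coords, 2)
print(first_two)
```

Either way the remedy on the R side is the same: wrap the value in as.integer() before passing it through reticulate.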

Krzysztof Polanski · Answer 3 · Thu Apr 22 2021 02:29:14 GMT+0800 (China Standard Time)

Cool, I'll take this insight and work it into the (scant) reticulate info in the readme.

Rachel Wellington · Answer 4 · Thu Apr 22 2021 02:32:26 GMT+0800 (China Standard Time)

Thank you! Sorry, I know y'all didn't really want to work with an R version, but thank you for making the effort to help with R issues anyway :)