SCTransform example in vignette not working
JesseRop opened this issue · comments
Describe the bug
I'm trying to replicate the example in the vignette to use SCT internally within scdblFinder but I get the error below
Error in if (length(sel_features) > nfeatures) sel_features <- selFeatures(sce[sel_features, :
the condition has length > 1
MRE -- Minimal example to reproduce the bug
nfeatures <- 1000
sce <- mockDoubletSCE(dbl.rate=0.1, ngenes=2000 )
# sctransform on real cells:
vst1 <- sctransform::vst(counts(sce), n_cells=min(ncol(sce),5000), verbosity=0)
sce <- sce[row.names(vst1$y),]
logcounts(sce) <- vst1$y
hvg <- row.names(sce)[head(order(vst1$gene_attr$residual_variance, decreasing=TRUE), nfeatures)]
# define a processing function that scDblFinder will use on the real+artificial doublets;
# the input should be a count matrix and the number of dimensions, and the output a PCA matrix
myfun <- function(e, dims){
# we use the thetas calculated from the first vst on real cells
e <- e[intersect(row.names(e), row.names(vst1$model_pars_fit[which(!is.na(vst1$model_pars_fit[,"theta"])),])),]
vst2 <- sctransform::vst(e, n_cells=min(ncol(e),5000), method="nb_theta_given",
theta_given=vst1$model_pars_fit[row.names(e),"theta"],
min_cells=1L, verbosity=0)
scater::calculatePCA(vst2$y, ncomponents=dims)
}
sce <- scDblFinder(sce, processing=myfun, nfeatures=hvg)
Traceback
1: scDblFinder(sce, processing = myfun, nfeatures = hvg)
Session info
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 14.4Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dyliblocale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods baseother attached packages:
[1] reticulate_1.31 caret_6.0-94 ggsankey_0.0.99999 scmap_1.20.2 ggh4x_0.2.4 MetBrewer_0.2.0
[7] SoupX_1.6.2 conflicted_1.2.0 ggvenn_0.1.9 Platypus_3.4.1 RColorBrewer_1.1-3 rmarkdown_2.24
[13] Hmisc_4.7-2 Formula_1.2-4 survival_3.5-0 lattice_0.20-45 plotly_4.10.2 SeuratDisk_0.0.0.9020
[19] SeuratData_0.2.2 scDblFinder_1.12.0 SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0 Biobase_2.58.0 GenomicRanges_1.50.2
[25] GenomeInfoDb_1.34.9 IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0 MatrixGenerics_1.10.0 matrixStats_1.0.0
[31] cowplot_1.1.1 tidyr_1.3.0 tibble_3.2.1 readr_2.1.4 purrr_1.0.2 stringr_1.5.0
[37] dplyr_1.1.2 ggpubr_0.5.0 ggrepel_0.9.3 ggplot2_3.4.3 DoubletFinder_2.0.3 patchwork_1.1.3
[43] rgl_1.0.1 sctransform_0.3.5 SeuratObject_4.1.3 Seurat_4.3.0.1loaded via a namespace (and not attached):
[1] rsvd_1.0.5 ica_1.0-3 class_7.3-21 Rsamtools_2.14.0 foreach_1.5.2 lmtest_0.9-40
[7] crayon_1.5.2 MASS_7.3-58.2 nlme_3.1-161 backports_1.4.1 rlang_1.1.1 XVector_0.38.0
[13] ROCR_1.0-11 readxl_1.4.2 irlba_2.3.5.1 limma_3.54.2 scater_1.26.1 xgboost_1.7.3.1
[19] BiocParallel_1.32.5 rjson_0.2.21 bit64_4.0.5 glue_1.6.2 pheatmap_1.0.12 parallel_4.2.2
[25] vipor_0.4.5 spatstat.sparse_3.0-2 dotCall64_1.0-2 spatstat.geom_3.2-4 tidyselect_1.2.0 fitdistrplus_1.1-11
[31] XML_3.99-0.13 zoo_1.8-12 GenomicAlignments_1.34.0 xtable_1.8-4 magrittr_2.0.3 evaluate_0.21
[37] scuttle_1.8.4 cli_3.6.1 zlibbioc_1.44.0 rstudioapi_0.14 miniUI_0.1.1.1 sp_2.0-0
[43] rpart_4.1.19 treeio_1.22.0 maps_3.4.1 fields_14.1 shiny_1.7.5 BiocSingular_1.14.0
[49] xfun_0.40 clue_0.3-63 cluster_2.1.4 ape_5.6-2 listenv_0.9.0 Biostrings_2.66.0
[55] png_0.1-8 reshape_0.8.9 future_1.33.0 ipred_0.9-14 withr_2.5.0 bitops_1.0-7
[61] ggforce_0.4.1 plyr_1.8.8 cellranger_1.1.0 hardhat_1.3.1 pROC_1.18.5 e1071_1.7-12
[67] dqrng_0.3.0 pillar_1.9.0 GlobalOptions_0.1.2 cachem_1.0.8 hdf5r_1.3.8 GetoptLong_1.0.5
[73] googleVis_0.7.1 DelayedMatrixStats_1.20.0 vctrs_0.6.3 ellipsis_0.3.2 generics_0.1.3 lava_1.8.0
[79] tools_4.2.2 foreign_0.8-84 beeswarm_0.4.0 munsell_0.5.0 tweenr_2.0.2 proxy_0.4-27
[85] DelayedArray_0.24.0 fastmap_1.1.1 compiler_4.2.2 abind_1.4-5 httpuv_1.6.11 rtracklayer_1.58.0
[91] prodlim_2023.08.28 GenomeInfoDbData_1.2.9 gridExtra_2.3 edgeR_3.40.2 deldir_1.0-9 utf8_1.2.3
[97] later_1.3.1 recipes_1.0.10 jsonlite_1.8.7 GGally_2.1.2 scales_1.2.1 ScaledMatrix_1.6.0
[103] tidytree_0.4.2 pbapply_1.7-2 carData_3.0-5 sparseMatrixStats_1.10.0 lazyeval_0.2.2 promises_1.2.1
[109] car_3.1-1 doParallel_1.0.17 R.utils_2.12.2 latticeExtra_0.6-30 goftest_1.2-3 spatstat.utils_3.0-3
[115] checkmate_2.1.0 statmod_1.5.0 Rtsne_0.16 uwot_0.1.16 igraph_1.5.1 yaml_2.3.7
[121] htmltools_0.5.6 memoise_2.0.1 BiocIO_1.8.0 locfit_1.5-9.7 viridisLite_0.4.2 digest_0.6.33
[127] mime_0.12 rappdirs_0.3.3 spam_2.9-1 yulab.utils_0.0.6 future.apply_1.11.0 data.table_1.14.8
[133] R.oo_1.25.0 labeling_0.4.2 splines_4.2.2 RCurl_1.98-1.12 broom_1.0.3 hms_1.1.2
[139] colorspace_2.1-0 base64enc_0.1-3 BiocManager_1.30.22 ggbeeswarm_0.7.1 shape_1.4.6 aplot_0.1.9
[145] nnet_7.3-18 Rcpp_1.0.11 RANN_2.6.1 circlize_0.4.15 fansi_1.0.4 tzdb_0.4.0
[151] ModelMetrics_1.2.2.2 parallelly_1.36.0 R6_2.5.1 ggridges_0.5.4 lifecycle_1.0.3 bluster_1.8.0
[157] ggsignif_0.6.4 leiden_0.4.3 Matrix_1.5-3 RcppAnnoy_0.0.21 iterators_1.0.14 spatstat.explore_3.2-1
[163] gower_1.0.1 htmlwidgets_1.6.2 beachmat_2.14.0 polyclip_1.10-4 timechange_0.2.0 gridGraphics_0.5-1
[169] ComplexHeatmap_2.14.0 globals_0.16.2 htmlTable_2.4.1 spatstat.random_3.1-5 progressr_0.14.0 lubridate_1.9.2
[175] codetools_0.2-18 metapod_1.6.0 randomForest_4.7-1.1 R.methodsS3_1.8.2 gtable_0.3.3 DBI_1.1.3
[181] ggfun_0.0.9 tensor_1.5 httr_1.4.7 KernSmooth_2.23-20 stringi_1.7.12 reshape2_1.4.4
[187] farver_2.1.1 viridis_0.6.2 timeDate_4032.109 ggtree_3.6.2 BiocNeighbors_1.16.0 restfulr_0.0.15
[193] interp_1.1-3 ggplotify_0.1.0 scattermore_1.2 scran_1.26.2 bit_4.0.5 jpeg_0.1-10
Hi,
Thanks for the nicely reported issue.
You're using scDblFinder 1.12.0, the current release one is 1.16.0, and the ability to specify which genes to use was added in 1.13.7, which is why you're getting an error. It should work if you update; alternatively you can use the following workaround:
# same as your example up until running scDblFinder, then
# simply subset to the hvg instead of manually specifying them:
sce2 <- scDblFinder(sce[hvg,], processing=myfun, nfeatures=nfeatures)
# then put the results back in the original object:
sce$scDblFinder.score <- sce2$scDblFinder.score
sce$scDblFinder.class <- sce2$scDblFinder.class
Pierre-Luc