Error in if (length(expected) > 1 && x > min(expected) && x < max(expected)) return(0): missing value where TRUE/FALSE needed
twoneu opened this issue · comments
Hi! Thank you for this great tool. I am encountering the error in the title when running scDblFinder on a large dataset (CellRanger estimated ~20,000 cells):
Assuming the input to be a matrix of counts or expected counts.
Aggregating features...
Warning message:
"Quick-TRANSfer stage steps exceeded maximum (= 1905250)"
Creating ~11084 artificial doublets...
Dimensional reduction
Evaluating kNN...
Training model...
Error in if (length(expected) > 1 && x > min(expected) && x < max(expected)) return(0): missing value where TRUE/FALSE needed
I have not encountered this error in several other (much smaller) samples I have tried, so is this related to the dataset being too large?
Traceback:
1. scDblFinder(peak_assay, aggregateFeatures = TRUE, nfeatures = 25,
. processing = "normFeatures")
2. .scDblscore(d, scoreType = score, addVals = pca[, includePCs,
. drop = FALSE], threshold = threshold, dbr = dbr, dbr.sd = dbr.sd,
. nrounds = nrounds, max_depth = max_depth, iter = iter, BPPARAM = BPPARAM,
. features = trainingFeatures, verbose = verbose, metric = metric,
. filterUnidentifiable = removeUnidentifiable, unident.th = unident.th)
3. which((d$type == "real" & doubletThresholding(d, dbr = dbr, dbr.sd = dbr.sd,
. stringency = 0.7, perSample = perSample, returnType = "call") ==
. "doublet") | (d$type == "doublet" & d$score < unident.th &
. filterUnidentifiable) | !d$include.in.training)
4. doubletThresholding(d, dbr = dbr, dbr.sd = dbr.sd, stringency = 0.7,
. perSample = perSample, returnType = "call")
5. .optimThreshold(d, dbr = .gdbr(d, dbr), dbr.sd = dbr.sd, stringency = stringency)
6. optimize(totfn, c(0, 1), maximum = FALSE)
7. (function (arg)
. f(arg, ...))(0.381966011250105)
8. f(arg, ...)
9. .prop.dev(d$type, d$score, expected, x)
Session info
`R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS
Matrix products: default
BLAS/LAPACK: /opt/conda/envs/NET_R_env/lib/libopenblasp-r0.3.21.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg38_1.4.5 BSgenome_1.66.3
[3] Biostrings_2.66.0 XVector_0.38.0
[5] CopyscAT_0.40 MASS_7.3-60
[7] jsonlite_1.8.4 sp_1.6-1
[9] rtracklayer_1.58.0 gplots_3.1.3
[11] tibble_3.2.1 tidyr_1.3.0
[13] edgeR_3.40.2 limma_3.54.2
[15] stringr_1.5.0 mclust_6.0.0
[17] changepoint_2.2.4 zoo_1.8-12
[19] data.table_1.14.8 igraph_1.4.3
[21] FNN_1.1.3.2 Rtsne_0.16
[23] biomaRt_2.54.0 fastcluster_1.2.3
[25] NMF_0.26 cluster_2.1.4
[27] rngtools_1.5.2 registry_0.5-1
[29] viridis_0.6.3 viridisLite_0.4.2
[31] dplyr_1.1.2 RColorBrewer_1.1-3
[33] scDblFinder_1.13.13 SingleCellExperiment_1.20.1
[35] SummarizedExperiment_1.28.0 MatrixGenerics_1.10.0
[37] matrixStats_0.63.0 glue_1.6.2
[39] ggplot2_3.4.1 EnsDb.Hsapiens.v86_2.99.0
[41] ensembldb_2.22.0 AnnotationFilter_1.22.0
[43] GenomicFeatures_1.50.2 AnnotationDbi_1.60.0
[45] Biobase_2.58.0 GenomicRanges_1.50.2
[47] GenomeInfoDb_1.34.9 IRanges_2.32.0
[49] S4Vectors_0.36.2 BiocGenerics_0.44.0
[51] Signac_1.10.0 SeuratObject_4.1.3
[53] Seurat_4.3.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 pbdZMQ_0.3-9
[3] scattermore_1.0 bit64_4.0.5
[5] irlba_2.3.5.1 DelayedArray_0.24.0
[7] KEGGREST_1.38.0 RCurl_1.98-1.12
[9] doParallel_1.0.17 generics_0.1.3
[11] ScaledMatrix_1.6.0 cowplot_1.1.1
[13] RSQLite_2.2.20 RANN_2.6.1
[15] future_1.32.0 bit_4.0.5
[17] spatstat.data_3.0-1 xml2_1.3.4
[19] httpuv_1.6.11 hms_1.1.3
[21] evaluate_0.21 promises_1.2.0.1
[23] fansi_1.0.4 restfulr_0.0.15
[25] progress_1.2.2 caTools_1.18.2
[27] dbplyr_2.3.1 DBI_1.1.3
[29] htmlwidgets_1.6.2 spatstat.geom_3.2-1
[31] purrr_1.0.1 ellipsis_0.3.2
[33] gridBase_0.4-7 deldir_1.0-9
[35] sparseMatrixStats_1.10.0 vctrs_0.6.2
[37] ROCR_1.0-11 abind_1.4-5
[39] cachem_1.0.8 withr_2.5.0
[41] progressr_0.13.0 sctransform_0.3.5
[43] GenomicAlignments_1.34.1 prettyunits_1.1.1
[45] scran_1.26.2 goftest_1.2-3
[47] IRdisplay_1.1 lazyeval_0.2.2
[49] crayon_1.5.2 spatstat.explore_3.2-1
[51] pkgconfig_2.0.3 nlme_3.1-162
[53] vipor_0.4.5 ProtGenerics_1.30.0
[55] rlang_1.1.0 globals_0.16.2
[57] lifecycle_1.0.3 miniUI_0.1.1.1
[59] filelock_1.0.2 BiocFileCache_2.6.0
[61] rsvd_1.0.5 polyclip_1.10-4
[63] lmtest_0.9-40 Matrix_1.5-4
[65] IRkernel_1.3.2 base64enc_0.1-3
[67] beeswarm_0.4.0 ggridges_0.5.4
[69] png_0.1-8 rjson_0.2.21
[71] bitops_1.0-7 KernSmooth_2.23-21
[73] blob_1.2.3 DelayedMatrixStats_1.20.0
[75] parallelly_1.35.0 spatstat.random_3.1-5
[77] beachmat_2.14.2 scales_1.2.1
[79] memoise_2.0.1 magrittr_2.0.3
[81] plyr_1.8.8 ica_1.0-3
[83] zlibbioc_1.44.0 compiler_4.2.2
[85] dqrng_0.3.0 BiocIO_1.8.0
[87] fitdistrplus_1.1-11 Rsamtools_2.14.0
[89] cli_3.6.1 listenv_0.9.0
[91] patchwork_1.1.2 pbapply_1.7-0
[93] tidyselect_1.2.0 stringi_1.7.12
[95] yaml_2.3.7 BiocSingular_1.14.0
[97] locfit_1.5-9.7 ggrepel_0.9.3
[99] grid_4.2.2 fastmatch_1.1-3
[101] tools_4.2.2 future.apply_1.10.0
[103] parallel_4.2.2 uuid_1.1-0
[105] bluster_1.8.0 foreach_1.5.2
[107] metapod_1.6.0 gridExtra_2.3
[109] digest_0.6.31 BiocManager_1.30.20
[111] shiny_1.7.4 Rcpp_1.0.10
[113] scuttle_1.8.4 later_1.3.1
[115] RcppAnnoy_0.0.20 httr_1.4.5
[117] colorspace_2.1-0 XML_3.99-0.14
[119] tensor_1.5 reticulate_1.28
[121] splines_4.2.2 uwot_0.1.14
[123] RcppRoll_0.3.0 statmod_1.5.0
[125] spatstat.utils_3.0-3 scater_1.26.1
[127] xgboost_1.7.5.1 plotly_4.10.1
[129] xtable_1.8-4 R6_2.5.1
[131] pillar_1.9.0 htmltools_0.5.5
[133] mime_0.12 fastmap_1.1.1
[135] BiocParallel_1.32.6 BiocNeighbors_1.16.0
[137] codetools_0.2-19 utf8_1.2.3
[139] lattice_0.21-8 spatstat.sparse_3.0-1
[141] curl_4.3.3 ggbeeswarm_0.7.2
[143] leiden_0.4.3 gtools_3.9.4
[145] survival_3.5-5 repr_1.1.6
[147] munsell_0.5.0 GenomeInfoDbData_1.2.9
[149] iterators_1.0.14 reshape2_1.4.4
[151] gtable_0.3.3 `
Hi,
20k cells certainly shouldn't be a problem, how many peaks do you have? Do you have mbkmeans installed?
Can you reproduce the error without the aggregation (i.e. just scDblFinder(peak_assay)
) ?
I've never seen this error, so if it'd be possible to share the object (ideally smaller if you can still reproduce the error, genes/samples can obviously be scrambled) it'll make debugging easier.
Pierre-Luc
Hi @twoneu ,
please respond or I'll close the issue.
I also have same error. Did you solve it? I have also just 20000 cell( in cellranger websummary). but My another data have also 20000cell. but it ran fine without error.
Then please provide the extra info requested above.
Owner
counts <- Read10X_h5("/data/jrgong/AD_multiome/fastq/aggr_2023_0616/aggr_2023_0616/outs/filtered_feature_bc_matrix.h5")
fragpath <- "/data/jrgong/AD_multiome/fastq//aggr_2023_0616/aggr_2023_0616/outs/atac_fragments.tsv.gz"
#get gene annotations for hg 38
annotation <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)
seqlevelsStyle(annotation) <- "UCSC"
#create a Seurat object containing the RNA data
AH_Mul <- CreateSeuratObject(counts = counts$Gene Expression
, assay = "RNA")
#create a Seurat object containing the ATAC data
AH_Mul[["ATAC"]] <- CreateChromatinAssay(counts = counts$Peaks, sep = c(":","-"), fragments = fragpath, annotation = annotation)
AH.PBMC.7 <- subset(AH_Mul, subset=aggr_number=="7")
ce <- scDblFinder(SingleCellExperiment(list(counts=AH.PBMC.7@assays$RNA@counts)))
AH.PBMC.7$doublet_scores <- sce$scDblFinder.score
AH.PBMC.7$doublet_class <- sce$scDblFinder.class
and then I saw the error like this :
#Creating ~16000 artificial doublets...
#Dimensional reduction
#Evaluating kNN...
#Training model...
#Error in if (length(expected) > 1 && x > min(expected) && x < max(expected)) return(0) :
#missing value where TRUE/FALSE needed
could you let me know how to solve it?
If you don't have a drive/platform where you can put it, send me an email at pierre-luc.germain [at ] hest.ethz.ch and I'll give you a link.