EDePasquale / DoubletDecon

A tool for removing doublets from single-cell RNA-seq data

Some samples have an error running Main_Doublet_Decon

WangRong423 opened this issue · comments

Hi all,
I am running 40 PBMC samples, but just two of them fail in Main_Doublet_Decon.
Error message:
WARNING: if using ICGS2 file input, please import 'rawDataFile' and 'groupsFile' as path/location instead of an R object.
Processing raw data...
Combining similar clusters...
Creating synthetic doublet profiles...
Step 1: Removing possible doublets...
Error in lsei(AA, BB, EE, FF, GG, HH) :
NA/NaN/Inf in foreign function call (arg 6)
Calls: Main_Doublet_Decon -> Is_A_Doublet -> DeconRNASeq -> lsei
In addition: Warning messages:
1: In if (class(groupsFile) == "character") { :
the condition has length > 1 and only the first element will be used
2: In if (class(groupsFile) == "character") { :
the condition has length > 1 and only the first element will be used
Execution halted
I am using R version 4.1.2. Please let me know how I can fix this. Thanks!

I ran into the same issue and was able to resolve it by wrapping my groupsFile argument in data.frame(). In my case this looked like:

      results <- Main_Doublet_Decon(rawDataFile = processed$newExpressionFile,
        groupsFile = data.frame(processed$newGroupsFile),
        filename = "DoubletDecon_results",
        location = paste0(out, "/"),
        fullDataFile = NULL,
        removeCC = FALSE,
        species = "hsa",
        rhop = 0.9,
        write = TRUE,
        PMF = TRUE,
        useFull = FALSE,
        heatmap = FALSE,
        centroids = FALSE,
        num_doubs = 100,
        only50 = FALSE,
        min_uniq = 4,
        nCores = 1)

I apologize, I spoke too soon: wrapping groupsFile in data.frame() altered the results. In your case, the class(groupsFile) warning shouldn't affect the results. However, I've found that in more recent versions of R (R 4.2.1) the same check raises an error instead of a warning. To resolve this, I changed line 106 of Main_Doublet_Decon.R from

  if(class(groupsFile)=="character"){

to

  if(class(groupsFile)[1]=="character"){

Any other way of handling data types that return more than one class (e.g. a matrix) would also work.
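As a small illustration of that last point, here is a minimal sketch of one alternative check. It assumes the intent of the original condition is to detect a file path passed as a single string. The helper name `is_path_like` and the file name are made up for this example and are not part of DoubletDecon.

```r
# Hypothetical helper (not part of DoubletDecon): TRUE only when `x` is a
# single, non-missing string, i.e. something that could be a file path.
is_path_like <- function(x) {
  is.character(x) && length(x) == 1L && !is.na(x)
}

groups_matrix <- matrix(c("1", "2", "a", "b"), nrow = 2)  # toy groups object
groups_path   <- "groups.txt"                             # made-up file name

class(groups_matrix)         # can contain more than one class, e.g. "matrix" "array"
is_path_like(groups_matrix)  # FALSE
is_path_like(groups_path)    # TRUE
```

Unlike `class(x) == "character"`, this always returns a single TRUE or FALSE, so it is safe inside `if ()` regardless of how many classes the object reports.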

Could the developers please merge the small fix that @drneavin mentioned? It took me a while to find a solution for newer R versions, but thanks to that fix I got it working.
Thank you.

After more in-depth investigation, this seems to be an issue in DeconRNASeq, specifically line 112 here: https://code.bioconductor.org/browse/DeconRNASeq/blob/RELEASE_3_18/R/DeconRNASeq.R#L112. When Is_A_Doublet calls DeconRNASeq, it uses the default use.scale = TRUE, which produces NaN and triggers the error when BB is all zeros.
So I changed this in my script to skip scaling when all values are 0, as shown here, and it seems to work. Hopefully this helps somebody else in the future.

    if (use.scale) {
      # Skip scaling when BB is all zeros: scale() would compute 0/0 and return NaN.
      if (!all(BB == 0)) {
        BB <- scale(BB)
      }
    }
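
To see why the guard matters, here is a small standalone sketch that reproduces the NaN outside of DeconRNASeq. The helper name `safe_scale` and the toy vectors are assumptions for illustration only; the real `BB` inside DeconRNASeq may have a different shape.

```r
# Hypothetical helper mirroring the guard above: leave an all-zero input
# untouched, otherwise centre and scale it with base R's scale().
safe_scale <- function(BB) {
  if (all(BB == 0)) BB else scale(BB)
}

zeros <- rep(0, 5)     # toy stand-in for an all-zero BB
vals  <- c(1, 2, 3)    # toy stand-in for a normal BB

anyNA(scale(zeros))       # TRUE: mean 0 and sd 0 give 0/0 = NaN
anyNA(safe_scale(zeros))  # FALSE: input returned unchanged
safe_scale(vals)          # scaled as usual
```

Note that this only covers the all-zero case described above. Any other constant input also has a standard deviation of 0 and would still yield NaN from `scale()`.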