raphael-group / chisel

CHISEL -- Copy-number Haplotype Inference in Single-cell by Evolutionary Links

Noisy RDR and CN calls

yifnzhao opened this issue

Hi Simone,

I ran CHISEL on my scWGS dataset (0.5x, ~100 cells, 5Mb bins) with a matched blood bulk sample (15x). For some reason, the RDR profile and copy-number calls are much noisier than the BAF profile.
A_5Mb baf
A_5Mb rdr
A_5Mb allelecn

Below are RDR and read count (per 5Mb bin) tracks of a representative cell, extracted from CHISEL's output (calls.tsv). I think there might be some problem in the RDR normalization step and am wondering whether you have suggestions on how to fix it. Thanks!

count_rep
rdr_rep
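For anyone who wants to reproduce per-cell tracks like the ones above, here is a minimal sketch of pulling one cell's read counts and RDR values out of calls.tsv. The column names (#CHR, START, CELL, COUNT, RDR) and the cell name are assumptions and should be checked against the header of your own file; the function name load_cell_tracks is only illustrative.

```python
import csv


def load_cell_tracks(handle, cell, chrom_col="#CHR", start_col="START",
                     cell_col="CELL", count_col="COUNT", rdr_col="RDR"):
    """Return [(chrom, start, count, rdr), ...] for one cell, in file order.

    The default column names are assumptions about the calls.tsv header;
    pass different names if your file uses other labels.
    """
    tracks = []
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row[cell_col] != cell:
            continue  # skip bins that belong to other cells
        tracks.append((row[chrom_col], int(row[start_col]),
                       float(row[count_col]), float(row[rdr_col])))
    return tracks


# Usage (hypothetical path and cell name):
# with open("calls.tsv", newline="") as handle:
#     tracks = load_cell_tracks(handle, "CELL_BARCODE")
```

The resulting list can be passed to any plotting tool to draw the count and RDR tracks side by side.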

Yifan

Hi Yifan,

I think the cause could be either a failed GC correction or noise in the bulk sample. I would suggest trying the following two alternatives:

  1. Please try running the new nonormal version of CHISEL, which does not require a matched normal sample (it still requires phased SNPs, though). If you install the latest version of CHISEL, you can read the details of the command by running chisel_nonormal -h. The interface should be the same as the default chisel command, except that you do not provide a matched normal sample.

  2. You can also try running the nonormal version of CHISEL with GC correction disabled by adding the flag --nogccorr.
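As a rough sketch of how the two alternatives could be scripted, the helper below only assembles the command lines. The only command-line pieces taken from this thread are chisel_nonormal, -h, and --nogccorr; input_args is a hypothetical placeholder for whatever input options you already pass to chisel (minus the matched normal), which are not listed here.

```python
import shlex


def nonormal_command(input_args, gc_correction=True):
    """Build an argv list for chisel_nonormal.

    input_args is a placeholder for the caller's own input options;
    this thread does not spell them out, so none are assumed here.
    """
    cmd = ["chisel_nonormal", *input_args]
    if not gc_correction:
        cmd.append("--nogccorr")  # flag named in alternative 2
    return cmd


# Command that prints the help text, as mentioned in alternative 1.
help_cmd = ["chisel_nonormal", "-h"]
print(shlex.join(help_cmd))
```

Each list can then be handed to subprocess.run(cmd, check=True) once the real input options are filled in.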

Please let us know how these solutions look, and we can suggest further steps.

Hi Simone,

Thanks for your suggestions. I tried both nonormal and nonormal with --nogccorr modes, and I think the adjusted read depth profile looks much more reasonable to me now.

The RDR plots are below (top = nonormal, bottom = nonormal --nogccorr):

image
image

Wonderful! Since the results are very similar, I would keep GC correction activated (even though GC bias might be relatively limited within 5Mb bins).
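If a number is preferred over a visual comparison, one possible way to check that two RDR profiles of the same cell are "very similar" is sketched below. It is not part of CHISEL; both function names and the choice of metrics (Pearson correlation between the two runs, and the median absolute difference between adjacent bins as a crude noise estimate) are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation between two equally long RDR profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2 bins)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # a flat profile has no defined correlation
    return cov / sqrt(vx * vy)


def adjacent_bin_noise(rdr):
    """Median absolute difference between consecutive bins.

    The median keeps the few large jumps at true copy-number
    breakpoints from dominating the estimate.
    """
    return median(abs(b - a) for a, b in zip(rdr, rdr[1:]))
```

A correlation close to 1 between the two runs, together with similar noise values, would support keeping GC correction enabled.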

Also, I would try adjusting the parameters for clone identification given your lower number of cells; please see the instructions in the Recommendations section.

I will close the issue for now, but please feel free to re-open it in case of further issues.

Hi Simone,

When running chisel_nonormal for one of my samples (around 100 cells), I encountered the following error at the art_illumina step:

[2023-Feb-23 04:25:14]BAM has been identified as paired-end sequencing with read length 151 and fragment size 344 (sd: 248)
[2023-Feb-23 04:25:14]Simulating sequencing reads
[2023-Feb-23 04:25:14]ART Illumina simuation of sequencing reads failed:

    ====================ART====================
             ART_Illumina (2008-2016)
          Q Version 2.5.8 (June 6, 2016)
     Contact: Weichun Huang <whduke@gmail.com>
    -------------------------------------------


None

The full error message can be found here.

Thanks,
Yifan