gavinha / TitanCNA

Hi

I am trying to run the snakemake version of the pipeline with the human_g1k_v37.fasta genome, using:

snakemake -s TitanCNA.snakefile --cores 5

The run was fine until this command was executed:

Rscript /home/a/asna2/TitanCNA/scripts/R_scripts/titanCNA.R --hetFile results/titan/tumCounts/tumor_sample_1.tumCounts.txt --cnFile results/ichorCNA/tumor_sample_1/tumor_sample_1.correctedDepth.txt --outFile results/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.titan.txt --outSeg results/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.segs.txt --outParam results/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.params.txt --outIGV results/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.seg --outPlotDir results/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1/ --libdir /home/a/asna2/TitanCNA/ --id tumor_sample_1 --numClusters 1 --numCores 1 --normal_0 0.5 --ploidy_0 3 --genomeStyle NCBI --genomeBuild hg19 --cytobandFile None --chrs "c(1:22, "X")" --estimateNormal map --estimatePloidy True --estimateClonality True --centromere /home/a/asna2/miniconda3/pkgs/r-ichorcna-0.1.0.20180710-r341_0/lib/R/library/ichorCNA/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt --alphaK 10000 --txnExpLen 1e15 --plotYlim "c(-2,4)" > logs/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.log 2> logs/titan/hmm/titanCNA_ploidy3/tumor_sample_1_cluster1.log

returned non-zero exit status 1.
File "/home/a/asna2/TitanCNA/scripts/snakemake/TitanCNA.snakefile", line 62, in __rule_runTitanCNA
File "/home/a/asna2/miniconda3/lib/python3.6/concurrent/futures/thread.py", line 56, in run

The tumor_sample_1_cluster1.log file showed:

$dens.bw
[1] 0.379763

$scat
[1] 0.08379496

$S_Dbw
[1] 0.463558

nd in "RSQLite" for requests: dbGetQuery
Running TITAN...
titan: Loading data results/titan/tumCounts/tumor_sample_1.tumCounts.txt
titan: Loading GC content and mappability corrected log2 ratios...
titan: Extracting read depth...
Removed 79 centromeric positions
Removed Chrs:
titan: Loading default parameters
titan: Using 1 cores.
titan: Parameter estimation
Optimal state path computation: Using 1 cores.
Writing results to results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.titan.txt, results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.segs.txt, results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.params.txt
titan: Saving parameters to results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.params.txt
Error in cytoband[, "chrom"] : incorrect number of dimensions
Calls: plotIdiogram
In addition: Warning messages:
1: In .replace_seqlevels_style(x_seqlevels, value) :
found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
2: In .replace_seqlevels_style(x_seqlevels, value) :
found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
3: In .replace_seqlevels_style(x_seqlevels, value) :
found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
4: In .replace_seqlevels_style(x_seqlevels, value) :
found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
5: In .replace_seqlevels_style(x_seqlevels, value) :
found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
Execution halted

I am not sure what is causing the problem, but here are the reference settings that I used:

# reference settings and paths to reference files

genomeBuild: hg19
genomeStyle: NCBI
refFasta: /home/a/asna2/Scrach/WES/human_g1k_v37.fasta
snpVCF: /home/a/asna2/Scrach/WES/hapmap_3.3.b37.vcf.gz
ichorCNA_exons: NULL
cytobandFile: None
centromere: /home/a/asna2/miniconda3/pkgs/r-ichorcna-0.1.0.20180710-r341_0/lib/R/library/ichorCNA/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt
sex: females # use None if both females and males are in sample set

The TitanCNA parameters are:

# use this for NCBI chr naming

chrs: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y

# TitanCNA params

TitanCNA_maxNumClonalClusters: 5
TitanCNA_chrs: c(1:22, "X")
TitanCNA_normalInit: 0.5
TitanCNA_maxPloidy: 3
TitanCNA_estimateNormal: map
TitanCNA_estimatePloidy: TRUE
TitanCNA_estimateClonality: TRUE
TitanCNA_alleleModel: binomial
TitanCNA_alphaK: 10000
TitanCNA_alphaR: 10000
TitanCNA_txnExpLen: 1e15
TitanCNA_plotYlim: c(-2,4)
TitanCNA_solutionThreshold: 0.05
TitanCNA_numCores: 1
TitanCNA_mem: 16G
TitanCNA_runtime: "300:00:00"
TitanCNA_pe: -pe smp 1 -binding linear:1

Any assistance is appreciated.

Hi @AAlhendi1707

Are you using the latest commit of TitanCNA? It should not look for the cytoband object to plot the idiogram if you are analyzing hg19. From your titanCNA.R command, the relevant options look correct: --cytobandFile None, --genomeBuild hg19, and --genomeStyle NCBI.

These are the only sections of the code that make use of the cytoband object.

TitanCNA/scripts/R_scripts/titanCNA.R

Lines 292 to 296 in eb288ea

    
    if (genomeBuild == "hg38" && file.exists(cytobandFile)){
        cytoband <- as.data.frame(fread(cytobandFile))
        names(cytoband) <- c("chrom", "start", "end", "name", "gieStain")
        #cytoband$V1 <- setGenomeStyle(cytoband$V1, genomeStyle = genomeStyle)
    }

TitanCNA/scripts/R_scripts/titanCNA.R

Lines 329 to 337 in eb288ea

    
        if (genomeBuild == "hg38"){
            sl <- seqlengths(seqinfo[chr])
            pI <- plotIdiogram.hg38(chr, cytoband=cytoband, seqinfo=seqinfo, xlim=c(0, max(sl)), unit="bp", label.y=label.y, new=FALSE, ylim=ylim)
        }else{
            pI <- plotIdiogram(chr, build="hg19", unit="bp", label.y=-0.35, label.y, new=FALSE, ylim=ylim)
        }
        dev.off()
    }
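To make the logic of these two excerpts concrete, here is a minimal base R sketch using the option values from the command in this issue. The variable names mirror the script, but the snippet is only an illustration and is not part of titanCNA.R.

```r
# Option values as passed on the command line in this issue
genomeBuild  <- "hg19"
cytobandFile <- "None"

# Same condition as the guard in titanCNA.R. For hg19 the left-hand side is
# FALSE, so `&&` short-circuits and `cytoband` is never created.
loadCytoband <- genomeBuild == "hg38" && file.exists(cytobandFile)
print(loadCytoband)

# Same branch condition as the plotting code: hg19 falls through to the
# `else` branch, i.e. the plotIdiogram(..., build = "hg19", ...) call.
branch <- if (genomeBuild == "hg38") "plotIdiogram.hg38" else "plotIdiogram"
print(branch)
```

With these inputs the script itself never defines `cytoband`, so any cytoband-related error has to come from inside the `plotIdiogram` call in the `else` branch.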

Also, the error appears in plotting.R

TitanCNA/R/plotting.R

Lines 654 to 667 in eb288ea

    
    plotIdiogram.hg38 <- function (chromosome, cytoband, seqinfo, cytoband.ycoords, xlim,
        ylim = c(0, 2), new = TRUE, label.cytoband = TRUE, label.y = NULL,
        srt, cex.axis = 1, outer = FALSE, taper = 0.15, verbose = FALSE,
        unit = c("bp", "Mb"), is.lattice = FALSE, ...)
    {
        def.par <- par(no.readonly = TRUE, mar = c(4.1, 0.1, 3.1,
            2.1))
        on.exit(def.par)
        if (is.lattice) {
            segments <- lsegments
            polygon <- lpolygon
        }
        cytoband <- cytoband[cytoband[, "chrom"] == chromosome, ]

Are you able to load the .RData file for this sample that failed? Can you print the contents of the object opt to see if the options from the command line are correctly read?
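One way to do that check is sketched below. It assumes the saved workspace contains an object named `opt`, as described above; the helper name `inspectOpt` and the file path are hypothetical and should be adapted to the failed sample.

```r
# Hypothetical helper: load an .RData file into its own environment and
# return the `opt` list stored in it (or NULL if there is no such object).
inspectOpt <- function(rdataFile) {
  env <- new.env()
  loaded <- load(rdataFile, envir = env)   # character vector of restored names
  message("Objects restored: ", paste(loaded, collapse = ", "))
  if (!exists("opt", envir = env, inherits = FALSE)) {
    return(NULL)
  }
  get("opt", envir = env)
}

# Hypothetical path; point this at the .RData written for the failed run.
rdataFile <- "tumor_sample_1_cluster1.RData"
if (file.exists(rdataFile)) {
  parsedOpt <- inspectOpt(rdataFile)
  # Print only the options relevant to the idiogram plotting
  str(parsedOpt[c("genomeBuild", "genomeStyle", "cytobandFile")])
}
```

Loading into a separate environment keeps the restored objects from overwriting anything already in the session.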

-Gavin

Hi Gavin

Yes, I am using the latest version of TitanCNA (version 1.19.1). I agree this error does not make sense, since it is not supposed to look for the cytoband object when plotting the idiogram for hg19!

I have checked the contents of TitanCNA/scripts/R_scripts/titanCNA.R and TitanCNA/R/plotting.R, which are basically identical to what I use in my pipeline!

Please find tumor_sample_1_cluster1.RData at this download link: https://1drv.ms/u/s!AsmG9t2-7Q1ohHxei7ehAEjk_clB

Furthermore, I have printed out the contents of opt:

$id
[1] "tumor_sample_1"

$hetFile
[1] "results/titan/tumCounts/tumor_sample_1.tumCounts.txt"

$cnFile
[1] "results/ichorCNA/tumor_sample_1/tumor_sample_1.correctedDepth.txt"

$numClusters
[1] 1

$numCores
[1] 1

$ploidy_0
[1] 2

$estimatePloidy
[1] TRUE

$normal_0
[1] 0.5

$estimateNormal
[1] "map"

$estimateClonality
[1] TRUE

$maxCN
[1] 8

$alphaK
[1] 10000

$alphaKHigh
[1] 10000

$txnExpLen
[1] 1e+15

$txnZStrength
[1] 1

$minDepth
[1] 10

$maxDepth
[1] 1000

$skew
[1] 0

$minClustProportion
[1] 0.05

$genomeStyle
[1] "NCBI"

$genomeBuild
[1] "hg19"

$chrs
[1] "c(1:22, "X")"

$gender
[1] "male"

$mapThres
[1] 0.9

$centromere
[1] "/home/a/asna2/miniconda3/pkgs/r-ichorcna-0.1.0.20180710-r341_0/lib/R/library/ichorCNA/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt"

$cytobandFile
[1] "None"

$libdir
[1] "/home/a/asna2/TitanCNA/"

$outFile
[1] "results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.titan.txt"

$outSeg
[1] "results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.segs.txt"

$outIGV
[1] "results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.seg"

$outParam
[1] "results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1.params.txt"

$outPlotDir
[1] "results/titan/hmm/titanCNA_ploidy2/tumor_sample_1_cluster1/"

$plotYlim
[1] "c(-2,4)"

$verbose
[1] FALSE

$help
[1] FALSE

What do you suggest?

Thanks!

Hi @AAlhendi1707

I am unsure what is going on here. As a workaround, you can simply comment out these lines:

TitanCNA/scripts/R_scripts/titanCNA.R

Lines 329 to 334 in eb288ea

    
    if (genomeBuild == "hg38"){
        sl <- seqlengths(seqinfo[chr])
        pI <- plotIdiogram.hg38(chr, cytoband=cytoband, seqinfo=seqinfo, xlim=c(0, max(sl)), unit="bp", label.y=label.y, new=FALSE, ylim=ylim)
    }else{
        pI <- plotIdiogram(chr, build="hg19", unit="bp", label.y=-0.35, label.y, new=FALSE, ylim=ylim)
    }

But leave line 333 uncommented:

TitanCNA/scripts/R_scripts/titanCNA.R

Line 333 in eb288ea

    
           pI <- plotIdiogram(chr, build="hg19", unit="bp", label.y=-0.35, label.y, new=FALSE, ylim=ylim)

Best,
Gavin

Thanks for this discussion. I also ran into this issue with build 37 and have a small PR that appears to fix it for my test case. The cause is exactly the line Gavin highlighted: it passes label.y twice, and the second, unnamed label.y gets misinterpreted as the cytoband argument. Hope this helps.
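The mechanism can be reproduced with a toy function. This is only a sketch: `toyIdiogram` is a hypothetical stand-in, and it assumes that the formals of `plotIdiogram` start with `chromosome`, `build`, `cytoband`, by analogy with the `plotIdiogram.hg38` signature quoted earlier in this thread. The value assigned to `label.y` is also made up.

```r
# Hypothetical stand-in for plotIdiogram; NOT the real function.
toyIdiogram <- function(chromosome, build, cytoband, label.y = NULL, ...) {
  cytoband[cytoband[, "chrom"] == chromosome, ]
}

label.y <- -0.2   # hypothetical value; any plain numeric scalar will do

# R matches named arguments first (build, label.y). The remaining unnamed
# arguments then fill the still-unmatched formals in order, so the stray
# unnamed label.y is bound to `cytoband`.
mc <- match.call(
  toyIdiogram,
  quote(toyIdiogram("1", build = "hg19", label.y = -0.35, label.y))
)
print(mc$cytoband)

# Indexing a numeric scalar like a data frame then fails inside the function.
res <- tryCatch(
  toyIdiogram("1", build = "hg19", label.y = -0.35, label.y),
  error = function(e) conditionMessage(e)
)
print(res)
```

Under that reading, the message caught here is the same "incorrect number of dimensions" error as in the log, and dropping the unnamed `label.y` from the call means nothing is passed as `cytoband` any more.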
