jianhong / ATACseqQC

ATAC-seq Quality Control

Home Page:https://jianhong.github.io/ATACseqQC/articles/ATACseqQC.html

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Problem with shiftGAlignmentsList()

jingyu0602 opened this issue · comments

Thanks for your package! I'm going to generate a TSSE score plot for an bulk ATAC-seq data and I use code below:

possibleTag <- list("integer"=c("AM", "AS", "CM", "CP", "FI", "H0", "H1", "H2",
"HI", "IH", "MQ", "NH", "NM", "OP", "PQ", "SM",
"TC", "UQ"),
"character"=c("BC", "BQ", "BZ", "CB", "CC", "CO", "CQ", "CR",
"CS", "CT", "CY", "E2", "FS", "LB", "MC", "MD",
"MI", "OA", "OC", "OQ", "OX", "PG", "PT", "PU",
"Q2", "QT", "QX", "R2", "RG", "RX", "SA", "TS",
"U2"))
library(Rsamtools)
bamTop100 <- scanBam(BamFile(bamfile, yieldSize = 100),
param = ScanBamParam(tag=unlist(possibleTag)))[[1]]$tag
tags <- names(bamTop100)[lengths(bamTop100)>0]

outPath <- "splited"
dir.create(outPath)

library(BSgenome.Hsapiens.UCSC.hg19)
seqlev <- "chr1" ## subsample data for quick run
which <- as(seqinfo(Hsapiens)[seqlev], "GRanges")
library(ATACseqQC)
gal <- readBamFile(bamfile, tag=tags, which=which, asMates=TRUE, bigFile=TRUE)
shiftedBamfile <- file.path(outPath, "shifted.bam")
gal1 <- shiftGAlignmentsList(gal, outbam=shiftedBamfile)

It runs smoothly using data in the package, but when I use my bam file, the last command runs more than 10 hours and gives a warning:
> gal1 <- shiftGAlignmentsList(gal, outbam=shiftedBamfile)
[bam_sort_core] merging from 68 files and 1 in-memory blocks...

It seems that gal1 object is created but the shifted.bam is not. So I failed to do downstream analysis. Is there anything wrong with my steps? or any suggestions for running this command? Thanks!

Jingyu

I don't think you will get gal1 if you did not get shifted bam file. Could you just try to sub sample your data to a small file by samtools and then try it again. I did not see any issue with your code. If there is no error message, I can not do debug in my side.

Jianhong.

Thanks for your reply!
It works after 10% sub sampling so maybe the failure is due to memory problem, so I will increase the memory and try it again.

Jingyu