GuipengLi / ChIA-PET2

a versatile and flexible pipeline for analysing different variants of ChIA-PET data

ERROR in running bwa

tiagochst opened this issue

Hello,

I have a problem with the second step.
If the bwa mem output is a SAM file, shouldn't samtools view be called with -S?

bwa_wrap GRCh38.d1.vd1.fa output/out_1.valid.fastq 1 output/out_1.valid.sam 0
Running BWA on trimmed reads ...
bwa mem -t 1 GRCh38.d1.vd1.fa output/out_1.valid.fastq | samtools view -h -F 2048 - > output/out_1.valid.sam
[bam_header_read] EOF marker is absent. The input is probably truncated.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "-".
ERROR in running bwa ...

Thanks!

commented

Hi, are you using short-read data? Try the "-d 1" option for short reads.

Hi,
Thanks, it worked! But now I have a problem in step 5.

[04-06 21:27:56] Running Step 5: Detect Interactions ...

extendpeak output/out_peaks.narrowPeak 500 output/out_peaks.slopPeak
Running  extendpeak...
bedtools slop -i output/out_peaks.narrowPeak -g hg38.chrom.sizes -b 500 | bedtools merge -d 256 > output/out_peaks.slopPeak
awk 'BEGIN{OFS="\t";i=1}{print $1,$2,$3,"peak_"i;i=i+1}' output/out_peaks.slopPeak > tmp.bed; mv tmp.bed output/out_peaks.slopPeak
peak_depth2 output/out.rmdup.bedpe.tag 100 output/out_peaks.slopPeak output/out.peaks.depth
Running peakdepth...
psort output/out.rmdup.bedpe.tag output/out.rmdup.bedpe.tag.sorted > /dev/null 2>&1
bedtools coverage -sorted -b output/out.rmdup.bedpe.tag.sorted -a output/out_peaks.slopPeak -counts > output/out.peaks.depth
ERROR: Sort order was unspecified, and file output/out.rmdup.bedpe.tag.sorted is not sorted lexicographically.
       Please re-reun with the -g option for a genome file.
       See documentation for details.

hg38.chrom.sizes is this file: https://github.com/igvteam/igv/blob/master/genomes/sizes/hg38.chrom.sizes

commented

This error seems to be caused by psort. Could you check if the file output/out.rmdup.bedpe.tag.sorted is empty or damaged?
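
A minimal sketch of that check, assuming the sorted tag file is tab-separated with the chromosome in column 1 and the start position in column 2 (the layout `bedtools -sorted` expects from `sort -k1,1 -k2,2n`). The helper name `check_bed_sorted` is hypothetical and not part of ChIA-PET2; it relies only on the standard `sort` utility.

```shell
# Hypothetical helper (not part of ChIA-PET2): classify a tab-separated,
# BED-like file as empty, sorted, or unsorted.
# Assumes chromosome in column 1 and start position in column 2.
check_bed_sorted() {
    file="$1"
    tab="$(printf '\t')"
    if [ ! -s "$file" ]; then
        echo "empty"
    # -c only checks order; -s skips the whole-line tie-break so rows with
    # equal chromosome and start are not reported as disorder.
    elif LC_ALL=C sort -c -s -t "$tab" -k1,1 -k2,2n "$file" 2>/dev/null; then
        echo "sorted"
    else
        echo "unsorted"
    fi
}

# Example: check_bed_sorted output/out.rmdup.bedpe.tag.sorted
```

If the result is "unsorted", running `tail -n 3` on the file is a quick way to see whether the out-of-order rows sit at the end.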

It is not empty
[Screenshot attached: 2017-04-07, 9:52 AM]

The last 3 lines were responsible for the error. If I delete them, bedtools runs.

[Screenshot attached: 2017-04-07, 10:05 AM]

I erased those lines manually and continued the pipeline.

But then I got:


[04-07 14:49:12] Running Step 6: QCplots ...

Please check input or Rerun step 6

[04-07 14:49:12] Running Step 7: MICC ...

Rscript /home/tiagochst/bin/ChIA-PET2_0.9.2/bin/MICC2.R output/out.interactions.intra.bedpe output/out.interactions.inter.bedpe output/out.interactions.MICC 2
Loading required package: VGAM
Loading required package: methods
Loading required package: stats4
Loading required package: splines
Running MICC...
Loading intra data...
Loading inter data...
Cacluating...
Error: Don't have enough confident interactions to learn the model.

Could you please help me? I'm not sure whether I did something wrong.

My --forward and --reverse files are both FASTQ files from biological replicate 1 (https://www.encodeproject.org/experiments/ENCSR000CAA/)

For --genome I was using GDC files

For --bedtoolsgenome I used this file
As you suggested, -d is 1 because of the short reads.

My final command was:
ChIA-PET2 -g GRCh38.d1.vd1.fa -b hg38.chrom.sizes -f ENCFF000LAF.fastq.gz -r ENCFF000KZT.fastq.gz -d 1 -s 1

I removed the psort call and the -sorted argument and continued step 5 manually. One of the replicates worked.
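
For anyone who would rather keep `-sorted`, one possible alternative is to re-sort the tag file with the standard `sort` utility instead of psort. This is only a sketch: the helper name `resort_bed` is hypothetical, and it assumes a tab-separated file with the chromosome in column 1 and the start in column 2. Whether this matches what psort is meant to produce is not confirmed here.

```shell
# Hypothetical helper (not part of ChIA-PET2): sort a tab-separated file by
# chromosome (byte order), then numerically by start position.
resort_bed() {
    in_file="$1"
    out_file="$2"
    tab="$(printf '\t')"
    LC_ALL=C sort -t "$tab" -k1,1 -k2,2n "$in_file" > "$out_file"
}

# Example, reusing the file names from the log above:
# resort_bed output/out.rmdup.bedpe.tag output/out.rmdup.bedpe.tag.sorted
# bedtools coverage -sorted -a output/out_peaks.slopPeak \
#     -b output/out.rmdup.bedpe.tag.sorted -counts > output/out.peaks.depth
```

With `-sorted`, the `-a` file has to follow the same sort order as the `-b` file, so the peak file may need the same treatment.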