whamg trouble - finishing instantly with no results

afkoeppel opened this issue 8 years ago

Trying to call SVs on some rat sequences and want to try whamg.

whamg -a $GENOME –f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err

Strangely, it finishes almost instantly (despite a >50 GB BAM file) and produces no results.

cat SL150028.err
INFO: fasta file: /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/WholeGenomeFasta/genome.fa
INFO: Loading discordant reads into forest.
INFO: Finished loading reads.
INFO: Gathering graphs from forest.
INFO: Matching breakpoints.
INFO: Printing.
INFO: done processing trees
INFO: WHAM finished normally, goodbye!

The output VCF file contains the VCF header lines but is otherwise empty.

I also tried with the -c option limiting to just a few chromosomes, but the results were the same.
Any advice about what I might be doing wrong would be appreciated.

Zev Kronenberg · Answer 1 · Thu Jun 16 2016 02:33:44 GMT+0800 (China Standard Time)

Odd.
Can you post the full error message?
Is this whole-genome paired-end sequencing data?
Also, can you post the header of the bam file?
Specifically, do you have the SQ tag?

afkoeppel · Answer 2 · Thu Jun 16 2016 02:53:02 GMT+0800 (China Standard Time)

Well, that's what's so odd: there was no error message. Nothing at all was echoed to the screen, and according to the .err file everything went fine, but there was no output (except for the VCF header) and it took no time to run.

samtools view -H SL150028.bwamem.unfilt.rmdup.sort.bam
@HD VN:1.3 SO:coordinate
@SQ SN:10 LN:112200500
@SQ SN:11 LN:93518069
@SQ SN:12 LN:54450796
@SQ SN:13 LN:118718031
@SQ SN:14 LN:115151701
@SQ SN:15 LN:114627140
@SQ SN:16 LN:90051983
@SQ SN:17 LN:92503511
@SQ SN:18 LN:87229863
@SQ SN:19 LN:72914587
@SQ SN:1 LN:290094216
@SQ SN:20 LN:57791882
@SQ SN:2 LN:285068071
@SQ SN:3 LN:183740530
@SQ SN:4 LN:248343840
@SQ SN:5 LN:177180328
@SQ SN:6 LN:156897508
@SQ SN:7 LN:143501887
@SQ SN:8 LN:132457389
@SQ SN:9 LN:121549591
@SQ SN:MT LN:16313
@SQ SN:X LN:154597545
@RG ID:SL150028 SM:SL150028
@PG ID:bwa PN:bwa CL:bwa mem /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/BWAIndex/genome.fa SL150028_1.fastq.gz SL150028_2.fastq.gz -t 20 -M -R @RG\tID:SL150028\tSM:SL150028 VN:0.7.12-r1039
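
For anyone wanting to answer the "do you have the SQ tag?" question without reading the header by eye, here is a minimal sketch in Python. It assumes the header text has already been captured from `samtools view -H` and that fields are tab-separated, as the SAM format specifies. The function name `parse_sq_lines` and the shortened sample header are made up for illustration; this is not part of whamg or samtools.

```python
def parse_sq_lines(header_text):
    """Return {sequence_name: length} for every @SQ line in a SAM header.

    Hypothetical helper: expects the raw text printed by `samtools view -H`,
    where each header line is a record type followed by tab-separated TAG:VALUE pairs.
    """
    sequences = {}
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue  # skip @HD, @RG, @PG and any other record types
        # Split each field on the first colon only, since values may contain colons.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            sequences[tags["SN"]] = int(tags["LN"])
    return sequences


# Shortened sample modelled on the header posted above.
header = "\n".join([
    "@HD\tVN:1.3\tSO:coordinate",
    "@SQ\tSN:10\tLN:112200500",
    "@SQ\tSN:MT\tLN:16313",
    "@RG\tID:SL150028\tSM:SL150028",
])

sq = parse_sq_lines(header)
print(f"{len(sq)} @SQ lines found: {sq}")
```

An empty result would mean the BAM has no reference sequence dictionary. In this thread the header was fine, so the cause had to be elsewhere.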

Zev Kronenberg · Answer 3 · Thu Jun 16 2016 02:55:46 GMT+0800 (China Standard Time)

Can you post the full SL150028.err error file?

There should be something like:

INFO: for Sample:xxx
STATS: UV940: mean depth ..............: 39.2523
STATS: UV940: sd depth ................: 10.7403
STATS: UV940: mean insert length: .....: 319.644
STATS: UV940: median insert length ....: 318
STATS: UV940: sd insert length ........: 69.2122
STATS: UV940: lower insert cutoff .....: 208.905
STATS: UV940: upper insert cutoff .....: 492.675
STATS: UV940: average base quality ....: 28.347
STATS: UV940: average mapping quality .: 59.0846
STATS: UV940: number of reads used ....: 100227

afkoeppel · Answer 4 · Thu Jun 16 2016 02:56:37 GMT+0800 (China Standard Time)

What I posted in the original message was the full file. I have only INFO fields, no STATS fields like in your example.
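
The symptom here (only INFO lines, no STATS lines) can be checked mechanically. The sketch below assumes, based only on the maintainer's example above, that a run which actually read the BAM writes lines starting with `STATS:` to stderr. The function name `has_stats_lines` and the sample logs are illustrative, not anything whamg provides.

```python
def has_stats_lines(log_text):
    """Return True if any line of a whamg stderr log starts with 'STATS:'.

    Assumption: per the example in this thread, per-sample statistics are
    reported on lines prefixed with 'STATS:' when reads were processed.
    """
    return any(line.startswith("STATS:") for line in log_text.splitlines())


# Log from the failing run in this thread (abbreviated).
failed_log = "\n".join([
    "INFO: Loading discordant reads into forest.",
    "INFO: Finished loading reads.",
    "INFO: WHAM finished normally, goodbye!",
])

# Log shaped like the maintainer's example (abbreviated).
ok_log = "\n".join([
    "INFO: for Sample:xxx",
    "STATS: UV940: mean depth ..............: 39.2523",
    "STATS: UV940: number of reads used ....: 100227",
])

for name, log in [("failed", failed_log), ("ok", ok_log)]:
    status = "has STATS lines" if has_stats_lines(log) else "no STATS lines, input was probably not read"
    print(f"{name}: {status}")
```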

Zev Kronenberg · Answer 5 · Thu Jun 16 2016 03:10:08 GMT+0800 (China Standard Time)

@afkoeppel thanks, I was expecting it to be longer, but maybe not if it is an options parsing error.

original:

"whamg -a $GENOME **–**f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"

try:

"whamg -a $GENOME -f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"

Note the difference in the dash for -f.

If this solves the problem I'll add better error handling for this case.
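
As a rough illustration of the kind of check that could catch this case, here is a small Python sketch that scans command-line arguments for characters that look like a hyphen but are not the ASCII `-`. It is not whamg's actual code (whamg is not written in Python), and the function name `find_dash_lookalikes` is made up. It relies on the standard `unicodedata` module: characters in the Unicode "dash punctuation" category (`Pd`) are flagged, plus the minus sign U+2212, which sits in a different category.

```python
import unicodedata


def find_dash_lookalikes(args):
    """Return (index, argument, character name) for each non-ASCII dash found.

    Hypothetical helper: flags any character in Unicode category 'Pd'
    (dash punctuation) other than the ASCII hyphen-minus, plus U+2212 MINUS SIGN.
    """
    problems = []
    for index, arg in enumerate(args):
        for ch in arg:
            is_dash = unicodedata.category(ch) == "Pd" or ch == "\u2212"
            if is_dash and ch != "-":
                problems.append((index, arg, unicodedata.name(ch)))
    return problems


# The failing command from this thread; "\u2013" is the en dash that was pasted in.
bad_args = ["whamg", "-a", "genome.fa", "\u2013f", "SL150028.bwamem.unfilt.rmdup.sort.bam"]
good_args = ["whamg", "-a", "genome.fa", "-f", "SL150028.bwamem.unfilt.rmdup.sort.bam"]

for index, arg, name in find_dash_lookalikes(bad_args):
    print(f"argument {index} ({arg!r}) contains {name}; did you mean '-'?")

print("good command problems:", find_dash_lookalikes(good_args))
```

In a real script the same function could be called on `sys.argv` before option parsing, so that a pasted en dash produces an explicit error instead of a silent no-op run.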

afkoeppel · Answer 6 · Thu Jun 16 2016 03:16:29 GMT+0800 (China Standard Time)

Well I'll be. I replaced the dash and now it seems to be running. I'll update as soon as I have output. I'm not even quite sure how I got that non-dash dash, but I think maybe from copy-pasting from the instructions here (I sometimes copy example commands and then just replace the file names with my own).

Zev Kronenberg · Answer 7 · Thu Jun 16 2016 03:18:17 GMT+0800 (China Standard Time)

Shame on me! I'll make sure to check for em vs. en dashes.

http://www.punctuationmatters.com/hyphen-dash-n-dash-and-m-dash/

The wrong dash was in the README.

afkoeppel · Answer 8 · Thu Jun 16 2016 03:43:28 GMT+0800 (China Standard Time)

That did the trick. I have a VCF with SVs in it now. Thank you so much for your help.