VCF file is empty
Axze-rgb opened this issue
Describe the bug
The output VCF file is empty when calling in individual (germline) mode.
Version
$ octopus --version
[2023-09-12 20:51:11] <INFO> ------------------------------------------------------------------------
[2023-09-12 20:51:11] <INFO> octopus v0.7.4 (ed012a6e)
[2023-09-12 20:51:11] <INFO> Copyright (c) 2015-2021 University of Oxford
[2023-09-12 20:51:11] <INFO> ------------------------------------------------------------------------
[2023-09-12 20:51:11] <EROR> A user error has occurred:
[2023-09-12 20:51:11] <EROR>
[2023-09-12 20:51:11] <EROR> The option you specified '--version' is not recognised.
[2023-09-12 20:51:11] <EROR>
[2023-09-12 20:51:11] <EROR> To help resolve this error use the --help command to view required and
[2023-09-12 20:51:11] <EROR> allowable options.
[2023-09-12 20:51:11] <INFO> ------------------------------------------------------------------------
Command
Command line to install octopus:
git clone https://github.com/luntergroup/octopus.git
scripts/install.py #all dependencies were in the PATH
Command line to run octopus:
octopus -R vaga.fa --threads 22 -C individual -P 2 -B 64Gb -I ancestor.sorted.bam -o Octopus_call/test.vcf.gz --bad-region-tolerance LOW --consider-unmapped-reads --min-supporting-reads 3 --forest-model /home/alessandro/software/octopus/resources/forests/germline.v0.7.4.forest.gz
Additional context
Just a normal FASTA reference and a BAM obtained by mapping with bwa-mem2.
bwa command:
while read p; do bwa-mem2 mem -R '@RG\tID:foo\tSM:bar' -k 12 -t22 vaga.fa ${p}.R1.fastq.gz ${p}.R2.fastq.gz|samtools sort -m 2G -@24 -o bwa_k10_mapping/${p}.sorted.bam; done <ID
Any idea why my VCF is empty? The run time is also extremely short, less than 5 minutes, which I find surprising. It gives me the impression that Octopus is skipping steps it shouldn't.
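To be concrete about what "empty" means here, this is a minimal sketch for counting the data records (non-header lines) in a gzip-compressed VCF. The function name `count_vcf_records` is my own and not part of Octopus, and the sketch assumes the output can be read by plain `gzip`.

```shell
# Hypothetical helper (not an Octopus command): count non-header VCF lines.
count_vcf_records() {
    # Decompress to stdout, skip header lines starting with '#'
    # and blank lines, then count whatever is left.
    gzip -cd "$1" | awk '!/^#/ && NF { n++ } END { print n + 0 }'
}

# Usage: count_vcf_records Octopus_call/test.vcf.gz
```

A result of 0 means the file contains only header lines, which is the symptom described above.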
I checked a pileup, just in case there were no variants in the alignment (which wouldn't make sense, but maybe I made a mistake somewhere), and there are variants.
I can send you the BAM file if you wish.
Thanks a lot
EDIT: removing the -C option seems to solve the issue. That said, the estimated run time is very long. Is there any way to speed things up?
[2023-09-12 21:33:28] <INFO> ------------------------------------------------------------------------
[2023-09-12 21:51:19] <INFO> Chrom_1:3107486 0.5% 17m 50s 2d 11h
[2023-09-12 21:55:49] <INFO> Chrom_1:4596832 1.0% 22m 21s 1d 12h
[2023-09-12 22:00:20] <INFO> Chrom_3:5244696 1.5% 26m 52s 1d 5h
[2023-09-12 22:01:11] <INFO> Chrom_3:15532716 1.6% 27m 42s 1d 4h
[2023-09-12 22:02:13] <INFO> Chrom_3:15535861 1.7% 28m 44s 1d 3h
[2023-09-12 22:03:06] <INFO> Chrom_1:4640543 1.8% 29m 37s 1d 2h
[2023-09-12 22:04:12] <INFO> Chrom_3:2097164 1.9% 30m 43s 1d 2h
[2023-09-12 22:05:01] <INFO> Chrom_3:9709436 2.0% 31m 33s 1d 1h
[2023-09-12 22:05:37] <INFO> Chrom_3:17116716 2.1% 32m 9s 24h 57m
[2023-09-12 22:06:36] <INFO> Chrom_1:4655724 2.2% 33m 7s 24h 31m
[2023-09-12 22:07:31] <INFO> Chrom_1:8911072 2.3% 34m 2s 24h 4m
[2023-09-12 22:08:31] <INFO> Chrom_1:10322451 2.4% 35m 2s 23h 43m
[2023-09-12 22:09:29] <INFO> Chrom_3:8281491 2.5% 36m 23h 23m
[2023-09-12 22:10:27] <INFO> Chrom_1:10323600 2.6% 36m 59s 23h 4m
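As a rough sanity check on those estimates, a naive linear extrapolation (remaining = elapsed × (100 − percent) / percent) can be computed from any progress line. This is only my assumption of a reasonable estimate, not necessarily how Octopus derives its own figure, and `eta_seconds` is a made-up helper name.

```shell
# Hypothetical helper: linear ETA from elapsed seconds and percent complete.
eta_seconds() {
    # $1 = elapsed seconds, $2 = percent complete (e.g. 2.6)
    awk -v e="$1" -v p="$2" 'BEGIN { printf "%d\n", e * (100 - p) / p }'
}

# Last progress line above: 36m 59s elapsed (2219 s) at 2.6 %.
eta_seconds 2219 2.6   # prints 83127
```

83127 seconds is a little over 23 hours, which is in line with the 23h 4m that Octopus reports on that line.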
EDIT: there might be an issue with the uploaded forest file, see here: #259 (comment)
Is it the correct file?
EDIT: indeed, the forest files can't be downloaded from GitHub because of their size, so Octopus in forest mode has no usable model to classify anything with.
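A quick way to check whether a downloaded forest is usable is to look at its size and test its gzip integrity. The sketch below assumes the failure mode is a truncated or placeholder file (for example a small text stub instead of the real archive). `check_forest` is a hypothetical helper name, and passing the check does not prove that the forest matches the installed Octopus version.

```shell
# Hypothetical helper: report size and gzip integrity of a forest file.
check_forest() {
    f="$1"
    # Fail early if the file is missing or has zero length.
    [ -s "$f" ] || { echo "missing or empty: $f"; return 1; }
    size=$(wc -c < "$f" | tr -d ' ')
    # gzip -t reads the whole stream and verifies it without writing output.
    if gzip -t < "$f" 2>/dev/null; then
        echo "ok: $f ($size bytes, valid gzip)"
    else
        echo "bad: $f ($size bytes, not a valid gzip file)"
        return 1
    fi
}

# Usage: check_forest /home/alessandro/software/octopus/resources/forests/germline.v0.7.4.forest.gz
```

If the file is reported as bad, or its size is far smaller than expected, the forest passed to `--forest-model` is probably not the real model.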