Question: Calling non-variant sites
ja-Reeve opened this issue
How can I get Octopus to output a VCF including non-variant sites?
Thanks. I tried that before on a short 600kb region of the genome, but it only returns variant sites.
My call:
octopus \
-R $PATH/ref_genome.fasta \
-i $PATH/bam_list.txt \
-o $PATH/Octopus_trial1.6b.vcf.gz \
-T Contig38698:0-608273 \
--very-fast --refcall POSITIONAL
Are you using the latest commit in the development branch?
What if you remove --very-fast ?
I tried a test run, but it timed out after 5 days. I will give it another go with more time.
Maybe use samtools on a single sample to filter the BAM file to just that region, so you can test Octopus without --very-fast?
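A minimal sketch of that subsetting step, assuming samtools is installed and the input BAM is coordinate-sorted and indexed. The function name and all file names are hypothetical placeholders, not something from this thread.

```shell
# Extract the reads overlapping one region into a small BAM and index it.
# Usage: subset_bam <in.bam> <region> <out.bam>
# The body runs in a subshell, so the variables stay local to the function.
subset_bam() (
    in_bam=$1
    region=$2
    out_bam=$3
    # -b writes BAM output; region queries need an index next to the input.
    samtools view -b -o "$out_bam" "$in_bam" "$region" &&
        samtools index "$out_bam"
)
```

For example: subset_bam sample1.bam Contig38698:1-608273 sample1.Contig38698.bam. samtools region strings are 1-based and inclusive, so the start is written as 1 here. The resulting small BAM can then be passed to Octopus on its own for a quicker test.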
Maybe I do not understand, but the output you show includes non-variant sites as well. For example, Contig38698:1 has phased genotype 0|0. The remaining sites in the interval Contig38698:2-13763 are presumably not passing the filters? So you might need to keep the raw calls with --keep-unfiltered-calls and perhaps do some post-processing to add the lines from the unfiltered file to the filtered file. I guess last time I used this, I realized I only wanted the confidently called sites, whether variant or non-variant.
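One way to start on that post-processing is to list the records that exist only in the unfiltered file. This is a rough sketch, assuming both VCFs have been decompressed to plain text (for example with gzip -dc) and that matching on CHROM and POS is good enough. The function name is hypothetical.

```shell
# Print records from the unfiltered VCF whose CHROM and POS do not appear
# in the filtered VCF. Both inputs are uncompressed, tab-separated VCF text.
# Usage: missing_sites <filtered.vcf> <unfiltered.vcf>
missing_sites() {
    awk -F'\t' '
        /^#/ { next }                                     # skip header lines
        FILENAME == ARGV[1] { seen[$1 FS $2] = 1; next }  # remember filtered sites
        !(($1 FS $2) in seen)                             # print sites not in the filtered file
    ' "$1" "$2"
}
```

The printed lines could then be appended to the filtered records and re-sorted by position, keeping in mind that they are exactly the calls that did not pass the filters.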
Calling again with --keep-unfiltered-calls worked. You're right, many sites were being removed by the filters.
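To see how many sites each filter accounts for, the FILTER column (column 7) of the unfiltered calls can be tallied. A small sketch, again assuming an uncompressed VCF. The function name is hypothetical.

```shell
# Count records per FILTER value (column 7) in an uncompressed VCF.
# Usage: filter_counts <calls.vcf>
filter_counts() {
    awk -F'\t' '
        /^#/ { next }                            # skip header lines
        { n[$7]++ }                              # tally each FILTER value
        END { for (f in n) print f "\t" n[f] }   # emit value and count
    ' "$1" | LC_ALL=C sort
}
```

Each output line is a FILTER value and its record count, separated by a tab. Records that failed several filters appear under the combined, semicolon-separated value.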
Thanks a lot for your help.
Great!