--refcall POSITIONAL reports * instead of . in the ALT column for reference calls, creating triallelic calls downstream
Kelzor opened this issue
Describe the bug
When I merge VCFs generated using --refcall POSITIONAL and -P 2, I get triallelic genotype calls (2|0, 0|2) in the resulting merged VCF file that are not present in the individual VCF files. I think it has to do with how bcftools treats "*" in the ALT column. Is there a simple way to change the * to . so that bcftools identifies the ALT as a reference call and not an alternative allele?
For example, in the single .vcf, this is the result:
MTB_anc 1143 . G A
1|0:96:21:37:1104:100:15,6:0.714,0.286:0.214,0.214:0.363:41,32:21:PASS
In the merged .vcf, this is the result for the same sample (first) and the other two that have been merged, which are both homozygous reference:
MTB_anc 1143 . G *,A
2|0:96:21:37:15,.,6:0.714,.,0.286:0.214,.,0.214:0.363:41,.,32:21:PASS:1104:100
0|0:177:59:37:59,.,.:1,.,.:0,.,.:.:41,.,.:59:PASS:.:.
0|0:75:25:37:25,.,.:1,.,.:0,.,.:.:41,.,.:25:PASS:.:.
Attached is a screenshot of the same three samples in IGV for that position. You can see the calls should be 0|1, 0|0, and 0|0.
This will become a problem for me at sites that are actually triallelic. Thank you!
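To make the requested change concrete, here is a minimal sketch of the kind of pre-merge rewrite being asked about. It assumes the per-sample VCFs are uncompressed text and that reference-call records have an ALT field that is exactly `*`; records with any other ALT value are left untouched. The function names `replace_star_alt` and `rewrite_vcf` are hypothetical, and whether the merged output then has the expected genotypes has not been verified.

```python
def replace_star_alt(line: str) -> str:
    """Rewrite ALT '*' to '.' on a single VCF data line (assumed refcall record).

    Header lines and records whose ALT is anything other than exactly '*'
    are returned unchanged.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # Column 5 (index 4) is ALT in the VCF fixed columns.
    if len(fields) >= 5 and fields[4] == "*":
        fields[4] = "."
        return "\t".join(fields) + newline
    return line


def rewrite_vcf(in_path: str, out_path: str) -> None:
    """Copy a plain-text VCF, applying replace_star_alt to every line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(replace_star_alt(line))
```

For example, `rewrite_vcf("sample.vcf", "sample.fixed.vcf")` (both file names are placeholders) would be run on each single-sample file before merging.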
Version
$ octopus --version
octopus version 0.7.4
Target: x86_64 Linux 5.10.25-linuxkit
SIMD extension: AVX2
Compiler: GNU 11.1.0
Boost: 1_76
Command
Command line to install octopus:
$ singularity build octopus.sif docker://dancooke/octopus
Command line to run octopus:
$ singularity run -B /data/stonelab:/data/stonelab /home/keblevin/octopus.sif \
--reference /data/stonelab/references/MTB_ancestor/MTB_ancestor.fasta \
-I /data/stonelab/Kelly_TB/BAMS_BAMS_BAMS/Vagene_et_al_bams/Colombia_1321_4402.bam \
-P 2 \
--refcall POSITIONAL \
--annotations \
--threads \
--output "19-4-23-Col1321-refcall-2P-annotations.vcf"