freeseek / gtc2vcf

Tools to convert Illumina IDAT/BPM/EGT/GTC and Affymetrix CEL/CHP files to VCF

compressed VCF

Ballote opened this issue · comments

Hi,

To convert CHP files to BCF, I use this command:

  1. bcftools +affy2vcf \
  2. --no-version -Ou \
  3. --csv "GenomeWideSNP_6.na35.annot.csv" \
  4. --fasta-ref "human_g1k_v37.fasta" \
  5. --chps /home/user/project/cc-chp/NAME \
  6. --snp /home/user/project/AxiomGT1.snp-posteriors.txt \
  7. --extra NAME.tsv | \
  8. bcftools sort -Ou -T ./bcftools-sort.XXXXXX | \
  9. bcftools norm --no-version -Ob -o NAME.vcf -c x -f "human_g1k_v37.fasta" && \
  10. bcftools index -f NAME.vcf

I was wondering: if I want VCF output instead, do I need to change lines 2, 8 and 9 to "-Ov", "-Ov" and "-Oz", respectively? My understanding is that "-Ov" and "-Oz" produce VCF, whereas "-Ou" and "-Ob" produce BCF.

If this is correct, it would look like this:

  1. bcftools +affy2vcf \
  2. --no-version -Ov \
  3. --csv "GenomeWideSNP_6.na35.annot.csv" \
  4. --fasta-ref "human_g1k_v37.fasta" \
  5. --chps /home/user/project/cc-chp/NAME \
  6. --snp /home/user/project/AxiomGT1.snp-posteriors.txt \
  7. --extra NAME.tsv | \
  8. bcftools sort -Ov -T ./bcftools-sort.XXXXXX | \
  9. bcftools norm --no-version -Oz -o NAME.vcf -c x -f "human_g1k_v37.fasta" && \
  10. bcftools index -f NAME.vcf

When I run it this way, I get the VCF file at the end, but I also get this message:

index: "NAME.vcf" is in a format that cannot be usefully indexed

I just want to know whether the change is correct and, if it is, whether there is any way to index the file usefully.

You should still keep -Ou for bcftools +affy2vcf and bcftools sort, as uncompressed BCF (binary VCF) is the fastest format for piping (though this is not advertised well enough). You can keep -Oz for the last bcftools norm command.
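For reference, here is a small sketch of what each -O letter means. The function describe_output_type is a hypothetical helper written for this thread, not something bcftools provides; the file extensions are just the usual naming conventions.

```shell
# Hypothetical helper (not part of bcftools): describe what each
# -O/--output-type letter produces and the extension usually used for it.
describe_output_type() {
    case "$1" in
        u) echo "uncompressed BCF (.bcf), best for pipes" ;;
        b) echo "compressed BCF (.bcf)" ;;
        v) echo "uncompressed VCF (.vcf)" ;;
        z) echo "bgzip-compressed VCF (.vcf.gz)" ;;
        *) echo "unknown output type: $1" >&2; return 1 ;;
    esac
}

# Print one line per output type.
for t in u b v z; do
    printf -- '-O%s: %s\n' "$t" "$(describe_output_type "$t")"
done
```

So the intermediate steps can stay on -Ou, and only the final command decides what ends up on disk.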

Your command should work as is, though it would be more appropriate to name the output NAME.vcf.gz rather than NAME.vcf, since bcftools norm with -Oz writes a compressed VCF.
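Putting the two suggestions together, the pipeline might look roughly like the sketch below. It reuses the placeholder paths and file names from the question unchanged. The wrapper function chp_to_vcf_gz and the variable out are names introduced here only for illustration, so that nothing runs until the function is called.

```shell
# Output name with the extension that matches -Oz (compressed VCF).
# "out" and "chp_to_vcf_gz" are illustrative names, not bcftools features.
out="NAME.vcf.gz"

chp_to_vcf_gz() {
    # Intermediate steps keep -Ou (uncompressed BCF) for fast piping;
    # only the last bcftools norm step writes compressed VCF with -Oz.
    bcftools +affy2vcf \
        --no-version -Ou \
        --csv "GenomeWideSNP_6.na35.annot.csv" \
        --fasta-ref "human_g1k_v37.fasta" \
        --chps /home/user/project/cc-chp/NAME \
        --snp /home/user/project/AxiomGT1.snp-posteriors.txt \
        --extra NAME.tsv | \
    bcftools sort -Ou -T ./bcftools-sort.XXXXXX | \
    bcftools norm --no-version -Oz -o "$out" -c x -f "human_g1k_v37.fasta" && \
    bcftools index -f "$out"
}
```

Calling chp_to_vcf_gz in a directory containing the input files would run the conversion and then index NAME.vcf.gz.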

I'm going to do that.

Thank you so much for your time.