freeseek / gtc2vcf

Tools to convert Illumina IDAT/BPM/EGT/GTC and Affymetrix CEL/CHP files to VCF

Running bcftools +affy2vcf with --models produces an error. How do I fix it?

YunHisangTang opened this issue · comments

Sorry to bother you.
I get the error "Probe Set AX-82929059 not found in models file" when I run bcftools +affy2vcf.
How do I fix it? Thanks!

Can I leave out the --models xxxxxx.snp-posteriors.txt option when I run bcftools +affy2vcf? Does it make a difference? If I don't use --models, I do get a VCF file.


bcftools +affy2vcf \
  --csv ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv \
  --fasta-ref ../resource-humanv37/human_g1k_v37.fasta \
  --calls ./GPS-step7-output/AxiomGT1.calls.txt \
  --confidences ./GPS-step7-output/AxiomGT1.confidences.txt \
  --summary ./GPS-step7-output/AxiomGT1.summary.txt \
  --models ./GPS-step7-output/AxiomGT1.snp-posteriors.txt \
  --output ./bcf-output/AxiomGT1.vcf

--- RUNNING LOG ---
Reading CSV file ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv
Reading SNP file ./GPS-step7-output/AxiomGT1.snp-posteriors.txt
Writing VCF file
Probe Set AX-82929059 not found in models file

bcftools +affy2vcf \
  --csv ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv \
  --fasta-ref ../resource-humanv37/human_g1k_v37.fasta \
  --chps ./GPS-step7-output/cc-chp/ \
  --models ./GPS-step7-output/AxiomGT1.snp-posteriors.txt \
  --output bcf0517chp.vcf

--- RUNNING LOG ---
Reading CSV file ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv
Reading CHP file ./GPS-step7-output/cc-chp//xxxxxxxxxxx.chp
...
Reading SNP file ./GPS-step7-output/AxiomGT1.snp-posteriors.txt
Writing VCF file
Probe Set AX-82929059 not found in models file

Can you run the following commands and tell me what you get?

grep AX-82929059 -A1 -B1 ./GPS-step7-output/AxiomGT1.calls.txt
grep AX-82929059 -A1 -B1 ./GPS-step7-output/AxiomGT1.snp-posteriors.txt

I am sure we can fix this, though it is difficult to support Axiom arrays as there is no publicly available data for me to test with.

Also, what command did you use to create the models file? Did you use the Axiom_BioBank1.r2.cdf file?

Thank you for your prompt reply.
I ran these commands.

  1. grep AX-82929059 -A1 -B1 ./GPS-step7-output/AxiomGT1.calls.txt

[image: grep output showing AX-82929059 in the calls file]

  2. grep AX-82929059 -A1 -B1 ./GPS-step7-output/AxiomGT1.snp-posteriors.txt
    no results

  3. I ran these commands (taken from p. 86 of the Axiom® Genotyping Solution Data Analysis Guide):

apt-probeset-genotype \
  --log-file ./GPS-step7-output/apt-probset-genotype.log \
  --xml-file ../APT-library/biobank/Axiom_BioBank1_96orMore_Step2.r2.apt-probeset-genotype.AxiomGT1.xml \
  --analysis-files-path ../APT-library/biobank/ \
  --out-dir ./GPS-step7-output/ \
  --summaries \
  --write-models \
  --cc-chp-output \
  --cel-files ./cel_list1.txt \
  --force

  4. Yes. I used the Axiom_BioBank1.r2.cdf file, which is referenced in Axiom_BioBank1_96orMore_Step2.r2.apt-probeset-genotype.AxiomGT1.xml
  • Axiom_BioBank1_96orMore_Step2.r2.apt-probeset-genotype.AxiomGT1.xml content:
    [image: contents of the XML file]

Okay, so the problem is that AX-82929059 is in your calls file but not in your SNP models file. In your calls file it follows the order found in the cdf file, as expected. This is quite puzzling. Could the SNP models file be truncated? What happens if you run these two commands:

cat ./GPS-step7-output/AxiomGT1.calls.txt | grep -v ^# | wc -l
cat ./GPS-step7-output/AxiomGT1.snp-posteriors.txt | grep -v ^# | wc -l

I cannot see a reason for the SNP missing from the SNP models file.

Wait, maybe I can see what the issue is. The SNP might be missing from your SNP models file because all of your samples have missing genotypes. Is AX-82929059 the first such SNP with this property? This might require a thorough rethinking of how the SNP models file is processed ...
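One way to check this hypothesis is to list the probesets whose calls are missing for every sample. The following is only a minimal sketch, not part of gtc2vcf or APT. It assumes the calls file is tab-delimited, that lines starting with # are comments, that the first non-comment row is a header, and that a no-call is encoded as -1. The function name all_nocall_probesets is hypothetical.

```shell
# Print probeset IDs for which every sample has a missing call.
# Assumptions (not confirmed in this thread): tab-delimited file,
# '#' comment lines, one header row, and -1 as the no-call code.
all_nocall_probesets() {
  awk -F'\t' '
    /^#/ { next }                            # skip comment lines
    !header_done { header_done = 1; next }   # skip the header row
    {
      # any call other than -1 means the probeset is not all-missing
      for (i = 2; i <= NF; i++) if ($i != "-1") next
      print $1
    }
  ' "$1"
}
```

For example, `all_nocall_probesets ./GPS-step7-output/AxiomGT1.calls.txt | head` would show whether AX-82929059 is the first probeset with this property.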

  1. I have 15 samples, circled in the following image.
    Is it due to the './.', which corresponds to missing genotypes?
    Or is it because APT does not recognize the probe ID (e.g. AFFX-SP-000001 in the CEL file) when mapping coordinates from the CEL files to the genotype calls files? Thanks.

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Example1 Example2 ...
[image: VCF records with './.' genotypes circled]

  2. I ran these commands.
cat ./GPS-step7-output/AxiomGT1.calls.txt | grep -v ^# | wc -l              # -> 628680
cat ./GPS-step7-output/AxiomGT1.snp-posteriors.txt | grep -v ^# | wc -l      # -> 395247
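
The two counts differ, so many probesets in the calls file have no entry in the models file. To see exactly which ones, something like the sketch below could be used. It is not part of gtc2vcf or APT. It assumes both files are tab-delimited, that lines starting with # are comments, that each file has one header row, and that the first column holds the probeset ID. The function name missing_probesets is hypothetical.

```shell
# Print probeset IDs found in the calls file but not in the models file.
# Assumptions (not confirmed in this thread): tab-delimited files,
# '#' comment lines, one header row each, probeset ID in column 1.
missing_probesets() {
  calls_file=$1
  models_file=$2
  awk -F'\t' '
    FNR == 1 { is_header = 1 }                  # a new file starts: expect a header row
    /^#/ { next }                               # skip comment lines
    is_header { is_header = 0; next }           # skip the first non-comment row
    FILENAME == ARGV[1] { seen[$1] = 1; next }  # models file: remember its IDs
    !($1 in seen) { print $1 }                  # calls file: report IDs never seen
  ' "$models_file" "$calls_file"
}
```

For example, `missing_probesets ./GPS-step7-output/AxiomGT1.calls.txt ./GPS-step7-output/AxiomGT1.snp-posteriors.txt | head` lists the first few affected probesets.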

The latest version of affy2vcf should handle the case of missing SNP models. Can you give it a try and see if it works for you?

Thank you so much for your help.
With the latest version of affy2vcf, my APT files (calls and CHP files) are processed completely.
I will close this.