liusihan / seGMM

A new tool to infer sex from massively parallel sequencing data.

Use `--allow-extra-chr` in plink to accept chrX ALT contigs in VCF

Faizal-Eeman opened this issue

seGMM fails at the plink step with the following error:

>> Collected feature of X chromosome heterozygosity
When running: plink --vcf HG002.vcf --make-bed --out test_seGMM/plink
An error was occured, please check the parameters!

I ran the plink command outside of seGMM, and the failure occurs when the input VCF contains ALT contigs such as chrX_KI270880v1_alt. plink needs the additional parameter --allow-extra-chr to accept this chromosome name.

Is there a way for seGMM to handle this? If not, it would be helpful to add one.

Faizal

By default, seGMM runs:

plink --vcf HG002.vcf --make-bed --out test_seGMM/plink

This command rejects the chrX ALT contig.

But the plink command below worked:

plink --vcf HG002.vcf --make-bed --out test_seGMM/plink --allow-extra-chr
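
For anyone who wants to try this locally, here is a minimal Python sketch of how a wrapper could append the flag. It is not seGMM's actual code: the function names `build_plink_cmd` and `run_plink` and the `allow_extra_chr` parameter are hypothetical. Only the plink options shown in the commands above are used.

```python
import subprocess


def build_plink_cmd(vcf_path, out_prefix, allow_extra_chr=True):
    """Build the plink argument list (hypothetical helper, not seGMM's API)."""
    cmd = ["plink", "--vcf", vcf_path, "--make-bed", "--out", out_prefix]
    if allow_extra_chr:
        # Lets plink accept non-standard contig names such as
        # chrX_KI270880v1_alt instead of aborting.
        cmd.append("--allow-extra-chr")
    return cmd


def run_plink(vcf_path, out_prefix, allow_extra_chr=True):
    """Run plink and raise CalledProcessError if it exits with an error."""
    cmd = build_plink_cmd(vcf_path, out_prefix, allow_extra_chr)
    subprocess.run(cmd, check=True)
```

Calling `run_plink("HG002.vcf", "test_seGMM/plink")` would then run the same command as the working one above, assuming plink is on the PATH.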

Hi Faizal-Eeman,

Thank you for bringing this to our attention. We appreciate your feedback. ALT contigs in the reference genome can cause reads to map to multiple locations, which can reduce the accuracy of variant calling. We therefore recommend removing ALT contigs from the reference genome before mapping.

seGMM uses plink to calculate the proportion of heterozygous variants on the X chromosome. In practice, you only need to provide a VCF file containing the X chromosome. To extract the variants located on the autosomes and the X chromosome, you can use tools such as VCFtools or BCFtools.
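
As a rough illustration of that filtering step, the sketch below uses only the Python standard library to drop ALT contigs from an uncompressed VCF. It is an assumption-laden stand-in for what VCFtools or BCFtools would do, not part of seGMM: the names `PRIMARY`, `keep_line`, and `filter_vcf_lines` are hypothetical, and it keeps only autosomes 1-22 and X (with or without a `chr` prefix).

```python
import re

# Contigs to keep: autosomes 1-22 and X, with or without a "chr" prefix.
PRIMARY = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X)$")
# Matches header lines of the form ##contig=<ID=NAME,...>
CONTIG_HEADER = re.compile(r"^##contig=<ID=([^,>]+)")


def keep_line(line):
    """Return True if a VCF line should be kept (hypothetical helper)."""
    if line.startswith("##contig"):
        # Keep a contig header only when it names a primary contig.
        match = CONTIG_HEADER.match(line)
        return bool(match and PRIMARY.match(match.group(1)))
    if line.startswith("#"):
        # Keep all other header lines unchanged.
        return True
    # Data line: the first tab-separated field is the chromosome name.
    chrom = line.split("\t", 1)[0]
    return bool(PRIMARY.match(chrom))


def filter_vcf_lines(lines):
    """Drop records and contig headers that are not on chr1-22 or chrX."""
    return [line for line in lines if keep_line(line)]
```

For example, a record on `chrX_KI270880v1_alt` is removed while a record on `chrX` is kept. For real data, compressed and indexed VCFs are better handled by the dedicated tools mentioned above.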

We hope this helps clarify any confusion, and please let us know if you have any further questions or concerns.

Sihan

I had the same issue. I worked around it by editing main.py, but after that I ran into a problem with the X map function. Can you fix that function?