zwdzwd / transvar

TransVar - multiway annotator for precision medicine

Problems with UniProt support

grayfall opened this issue

I'm having problems converting amino-acid substitutions to genomic coordinates using UniProt identifiers. Some queries work as expected, e.g.

$ transvar panno -i 'Q9BXW4:p.G126A' --uniprot --ccds
input	transcript	gene	strand	coordinates(gDNA/cDNA/protein)	region	info
Q9BXW4:p.G126A	CCDS31074 (protein_coding)	MAP1LC3C	-	chr1:g.242159532C>G/c.377G>C/p.G126A	inside_[cds_in_exon_4]	CSQN=Missense;reference_codon=GGC;candidate_codons=GCA,GCC,GCG,GCT;candidate_mnv_variants=chr1:g.242159531_242159532delGCinsTG,chr1:g.242159531_242159532delGCinsCG,chr1:g.242159531_242159532delGCinsAG;source=CCDS

For quite a few other proteins, though, I get the following error:

$ transvar panno -i 'Q99418:p.E156D' --uniprot --ccds
input	transcript	gene	strand	coordinates(gDNA/cDNA/protein)	region	info
[wrap_exception] warning: seek out of range
Q99418:p.E156D	.	.	.	././.	.	Error_seek out of range
[wrap_exception] warning: seek out of range
Q99418:p.E156D	.	.	.	././.	.	Error_seek out of range

I've tried different positions and substitutions (e.g. -i 'Q99418:1') to no avail. Other affected proteins include P04180, P45381, Q14289 and P49916. Am I doing something wrong?
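To find out which identifiers are affected without reading the output by eye, something along these lines might help. It is only a rough sketch: it assumes the output is tab-separated with the seven columns shown above, that failures are marked by an info field starting with `Error_`, and that warning lines start with `[`. The helper names (`run_panno`, `parse_panno_output`, `failed_inputs`) are made up for this example and are not part of TransVar.

```python
import subprocess


def run_panno(query):
    """Run `transvar panno` with the flags used in this thread; return its output.

    stderr is merged into stdout because it is not known here which stream
    the [wrap_exception] warnings are written to.
    """
    result = subprocess.run(
        ["transvar", "panno", "-i", query, "--uniprot", "--ccds"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def parse_panno_output(text):
    """Turn panno output into a list of dicts, skipping header and warnings."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines, "[wrap_exception] ..." warnings and the header row.
        if not line.strip() or line.startswith("[") or line.startswith("input\t"):
            continue
        fields = line.split("\t")
        if len(fields) < 7:
            continue  # not a data row in the assumed 7-column layout
        rows.append(
            {
                "input": fields[0],
                "transcript": fields[1],
                "gene": fields[2],
                "info": fields[6],
            }
        )
    return rows


def failed_inputs(rows):
    """Return the sorted, de-duplicated inputs whose info field reports an error."""
    return sorted({row["input"] for row in rows if row["info"].startswith("Error_")})
```

Looping `failed_inputs(parse_panno_output(run_panno(q)))` over a list of queries would then give the subset that hits the error.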

Thanks for reporting. This is what I get:

$ transvar panno -i 'Q99418:p.E156D' --uniprot --ccds
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
Q99418:p.E156D  CCDS12722 (protein_coding)      CYTH2   +       chr19:g.48977195G>C/c.468G>C/p.E156D    inside_[cds_in_exon_6]  CSQN=Missense;reference_codon=GAG;candidate_codons=GAC,GAT;candidate_snv_variants=chr19:g.48977195G>T;source=CCDS
Q99418:p.E156D  CCDS12722 (protein_coding)      CYTH2   +       chr19:g.48977195G>C/c.468G>C/p.E156D    inside_[cds_in_exon_6]  CSQN=Missense;reference_codon=GAG;candidate_codons=GAC,GAT;candidate_snv_variants=chr19:g.48977195G>T;source=CCDS

Have you run

transvar config --download_idmap

to get the UniProt ID maps?

Thanks,

@zwdzwd thanks for replying so quickly. Yes, I've downloaded the ID mapping, which is why some proteins do work.

(mutagenesis) $ transvar config --download_idmap
[downloading] ~/.cenvs/envs/mutagenesis/lib/python3.6/site-packages/transvar/transvar.download/uniprot.idmapping.txt.gz.idx ..Done (69.7 MB).

(mutagenesis) $ transvar panno -i 'Q99418:p.E156D' --uniprot --ccds
input	transcript	gene	strand	coordinates(gDNA/cDNA/protein)	region	info
[wrap_exception] warning: seek out of range
Q99418:p.E156D	.	.	.	././.	.	Error_seek out of range
[wrap_exception] warning: seek out of range
Q99418:p.E156D	.	.	.	././.	.	Error_seek out of range

(mutagenesis) $ transvar panno -i 'Q9BXW4:p.G126A' --uniprot --ccds
input	transcript	gene	strand	coordinates(gDNA/cDNA/protein)	region	info
Q9BXW4:p.G126A	CCDS31074 (protein_coding)	MAP1LC3C	-	chr1:g.242159532C>G/c.377G>C/p.G126A	inside_[cds_in_exon_4]	CSQN=Missense;reference_codon=GGC;candidate_codons=GCA,GCC,GCG,GCT;candidate_mnv_variants=chr1:g.242159531_242159532delGCinsTG,chr1:g.242159531_242159532delGCinsCG,chr1:g.242159531_242159532delGCinsAG;source=CCDS

(mutagenesis) $ transvar --version
TransVar Version 2.3.4.20161215

(mutagenesis) $ python --version
Python 3.6.6 :: Anaconda, Inc.

Hi @grayfall ,

Could you try the latest version? I tried with your Q99418 example and it seems to be working.

$ transvar panno -i 'Q99418:p.E156D' --uniprot --ccds
input   transcript      gene    strand  coordinates(gDNA/cDNA/protein)  region  info
Q99418:p.E156D  CCDS12722 (protein_coding)      CYTH2   +       chr19:g.48977195G>C/c.468G>C/p.E156D    inside_[cds_in_exon_6]  CSQN=Missense;reference_codon=GAG;candidate_codons=GAC,GAT;candidate_snv_variants=chr19:g.48977195G>T;source=CCDS
Q99418:p.E156D  CCDS12722 (protein_coding)      CYTH2   +       chr19:g.48977195G>C/c.468G>C/p.E156D    inside_[cds_in_exon_6]  CSQN=Missense;reference_codon=GAG;candidate_codons=GAC,GAT;candidate_snv_variants=chr19:g.48977195G>T;source=CCDS
$ transvar config
Reference version: hg19
$ transvar --version
TransVar Version 2.4.0.20180701
$ python --version
Python 3.6.3 :: Anaconda, Inc.

@zwdzwd yes, the new release seems to work fine. Thank you.
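For anyone hitting the same error, a quick way to check whether an installation is at least as new as the release that worked in this thread could look like the sketch below. It assumes the `transvar --version` output keeps the `TransVar Version X.Y.Z.DATE` form shown above. Note that the thread only establishes that 2.3.4.20161215 failed and 2.4.0.20180701 worked; which intermediate release actually fixed the problem is not known, so `KNOWN_GOOD` is a conservative assumption, and the function names are made up for this example.

```python
import re

# The version reported as working in this thread (not necessarily the first fix).
KNOWN_GOOD = (2, 4, 0, 20180701)


def parse_transvar_version(line):
    """Extract the version from a `transvar --version` line as a tuple of ints."""
    match = re.search(r"TransVar Version (\d+(?:\.\d+)*)", line)
    if match is None:
        raise ValueError("unrecognised version line: %r" % line)
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least_known_good(line):
    """True if the reported version is not older than the one that worked here."""
    return parse_transvar_version(line) >= KNOWN_GOOD
```

With the two version strings from this thread, the older one compares as older than `KNOWN_GOOD` and the newer one as equal to it.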