Mouse genes cannot be queried by ensembl ID
cconow opened this issue
querymany(["Eef2", "ENSMUSG00000034994", "ENSG00000167658"], species="mouse,human", scopes="_id,symbol,ensemblgene", fields="ensembl,symbol,genomic_pos", returnall=True)
Returns
{
  "out": [
    {
      "query": "Eef2",
      "_id": "1938",
      "_score": 17.811184,
      "ensembl": {
        "gene": "ENSG00000167658",
        "protein": [
          "ENSP00000307940",
          "ENSP00000471265"
        ],
        "transcript": [
          "ENST00000309311",
          "ENST00000594885",
          "ENST00000596417",
          "ENST00000598182",
          "ENST00000598436",
          "ENST00000600720",
          "ENST00000600794"
        ],
        "translation": [
          {
            "protein": "ENSP00000471265",
            "rna": "ENST00000600794"
          },
          {
            "protein": "ENSP00000307940",
            "rna": "ENST00000309311"
          }
        ],
        "type_of_gene": "protein_coding"
      },
      "genomic_pos": {
        "chr": "19",
        "end": 3985463,
        "ensemblgene": "ENSG00000167658",
        "start": 3976056,
        "strand": -1
      },
      "symbol": "EEF2"
    },
    {
      "query": "Eef2",
      "_id": "13629",
      "_score": 14.938413,
      "genomic_pos": {
        "chr": "10",
        "end": 81018332,
        "ensemblgene": "ENSMUSG00000034994",
        "start": 81012465,
        "strand": 1
      },
      "symbol": "Eef2"
    },
    {
      "query": "ENSMUSG00000034994",
      "notfound": true
    },
    {
      "query": "ENSG00000167658",
      "_id": "1938",
      "_score": 26.22843,
      "ensembl": {
        "gene": "ENSG00000167658",
        "protein": [
          "ENSP00000307940",
          "ENSP00000471265"
        ],
        "transcript": [
          "ENST00000309311",
          "ENST00000594885",
          "ENST00000596417",
          "ENST00000598182",
          "ENST00000598436",
          "ENST00000600720",
          "ENST00000600794"
        ],
        "translation": [
          {
            "protein": "ENSP00000471265",
            "rna": "ENST00000600794"
          },
          {
            "protein": "ENSP00000307940",
            "rna": "ENST00000309311"
          }
        ],
        "type_of_gene": "protein_coding"
      },
      "genomic_pos": {
        "chr": "19",
        "end": 3985463,
        "ensemblgene": "ENSG00000167658",
        "start": 3976056,
        "strand": -1
      },
      "symbol": "EEF2"
    }
  ],
  "dup": [
    [
      "Eef2",
      2
    ]
  ],
  "missing": [
    "ENSMUSG00000034994"
  ]
}
This shows that querying human for Eef2 returns information in the ensembl field, while querying mouse does not. Additionally, searching by the Ensembl ID works for human but not for mouse. This is consistent across all genes I have tried. Interestingly, genomic_pos does still show the Ensembl ID for mouse.
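To make the pattern in the output easier to spot across many genes, here is a minimal sketch that sorts the entries of the "out" list into three groups: queries that were not found, hits that lack an ensembl field, and hits that have one. The function name classify_hits and the trimmed sample_out list are illustrative assumptions; only the keys (query, _id, notfound, ensembl) come from the output shown above.

```python
def classify_hits(out):
    """Group querymany 'out' entries by whether they carry an 'ensembl' field.

    classify_hits is a hypothetical helper, not part of the mygene client.
    """
    report = {"notfound": [], "no_ensembl": [], "ok": []}
    for hit in out:
        if hit.get("notfound"):
            # The query matched nothing at all.
            report["notfound"].append(hit["query"])
        elif "ensembl" not in hit:
            # A gene was matched, but its record has no ensembl field.
            report["no_ensembl"].append((hit["query"], hit["_id"]))
        else:
            report["ok"].append((hit["query"], hit["_id"]))
    return report


# Trimmed copy of the output above (only the keys needed for the check).
sample_out = [
    {"query": "Eef2", "_id": "1938",
     "ensembl": {"gene": "ENSG00000167658"}, "symbol": "EEF2"},
    {"query": "Eef2", "_id": "13629",
     "genomic_pos": {"ensemblgene": "ENSMUSG00000034994"}, "symbol": "Eef2"},
    {"query": "ENSMUSG00000034994", "notfound": True},
    {"query": "ENSG00000167658", "_id": "1938",
     "ensembl": {"gene": "ENSG00000167658"}, "symbol": "EEF2"},
]

report = classify_hits(sample_out)
print(report)
```

With the real client, the same function could be applied to the "out" list of a querymany call made with returnall=True.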
@cconow Thanks for letting us know. I can confirm this issue from this mouse gene record:
https://mygene.info/v3/gene/13629?fields=ensembl
It returns empty, but it is supposed to return a matching ensembl field (e.g., in this gene).
We had a similar issue with other genes when we recently updated data from the latest Ensembl v110 release. We deployed a fix last week, but it looks like some genes are still missing this fix. We will take a closer look and hope to fix them very soon.
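For anyone who wants to repeat the single-record check (requesting https://mygene.info/v3/gene/13629?fields=ensembl and looking for an ensembl field), a rough sketch using only the Python standard library might look like this. The helper names gene_url, fetch_gene, and has_field are made up for illustration, and the exact shape of an "empty" response is an assumption: the sketch only assumes the record comes back as a JSON object without the requested key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://mygene.info/v3/gene"  # endpoint taken from the URL above


def gene_url(gene_id, fields):
    """Build the gene-record URL for the given ID and fields parameter."""
    return f"{BASE_URL}/{gene_id}?{urlencode({'fields': fields})}"


def fetch_gene(gene_id, fields):
    """Fetch one gene record as a dict (performs a network request)."""
    with urlopen(gene_url(gene_id, fields), timeout=30) as resp:
        return json.load(resp)


def has_field(record, field):
    """True if the record carries a non-empty value for the field."""
    return bool(record.get(field))
```

Calling has_field(fetch_gene("13629", "ensembl"), "ensembl") would then report whether the mouse record carries the field; nothing in the sketch runs a request until fetch_gene is called.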
I have deployed the fix to the Ensembl v110 release. Let me know if you have any problems. The links that were an issue are working now.
I also want to add a note that we are adding data tests to cover more species, such as mouse and rat, in addition to the human genes covered by our current test suite. This issue seems to impact only mouse genes, not human genes, so our current test procedure did not catch it before the data release.
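As a rough idea of what a per-species data test could look like, the sketch below checks that each sampled record has an ensembl.gene value with the stable-ID prefix expected for its species. This is not the project's actual test suite: the function names, the choice of species, and the sample records are all assumptions for illustration. The sample mouse record mirrors the state reported in this issue (no ensembl field).

```python
# Ensembl gene stable-ID prefixes for the species mentioned in this thread.
EXPECTED_PREFIX = {"human": "ENSG", "mouse": "ENSMUSG", "rat": "ENSRNOG"}


def as_list(value):
    """Normalize a field that may be absent, a single object, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ensembl_gene_ids(record):
    """Collect every ensembl.gene value found in a gene record."""
    return [e["gene"] for e in as_list(record.get("ensembl")) if "gene" in e]


def check_records(records_by_species):
    """Return (species, _id, reason) tuples for records that fail the check."""
    failures = []
    for species, records in records_by_species.items():
        prefix = EXPECTED_PREFIX[species]
        for rec in records:
            ids = ensembl_gene_ids(rec)
            if not ids:
                failures.append((species, rec["_id"], "missing ensembl.gene"))
            elif not any(i.startswith(prefix) for i in ids):
                failures.append((species, rec["_id"], "unexpected prefix"))
    return failures


# Hypothetical sample: one record per species, shaped like the output above.
sample = {
    "human": [{"_id": "1938", "ensembl": {"gene": "ENSG00000167658"}}],
    "mouse": [{"_id": "13629",
               "genomic_pos": {"ensemblgene": "ENSMUSG00000034994"}}],
}

failures = check_records(sample)
print(failures)
```

A check like this, run over a handful of well-known genes per species, would flag the mouse record here while letting the human record pass.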