ohsu-comp-bio / g2p-aggregator

Associations of genomic features, drugs and diseases

Jax CKB disease terms incorrect

ahwagner opened this issue

Currently, 87% of associations from Jax CKB are assigned the generic term DOID:162 (Cancer). This isn't correct; our other resources are much more reasonable at 0-7%.

On reloading with the most recent data release, this number is closer to 24%. I'm not sure why it changed so dramatically (possibly an error in my original query?), but it is still far above our other resources.

commented

The following terms are mapped to the generic term Cancer:

$ grep "\tCancer" disease_alias.tsv
Advanced Solid Tumor	Cancer
Solid tumor	Cancer
All Solid Tumors	Cancer
Any cancer type	Cancer
Solid tumors	Cancer
Malignant neoplastic disease	Cancer
All Tumors	Cancer

commented

Regarding jax:

$ cat  jax.json | jq '.jax | .indication.name' | grep 'Advanced Solid Tumor' | wc -l
    1330

commented

That is 1330 out of 5754 associations, or about 23%, which is in line with the roughly 24% seen after reloading.

$ cat  jax.json | jq '.jax | .indication.name'  | wc -l
    5754
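
The two counts above can also be produced in one pass, broken down by which alias is responsible. The following is a minimal sketch, not the aggregator's own code. It assumes `jax.json` holds one JSON object per line with the name at `jax.indication.name` (inferred from the jq filter) and that `disease_alias.tsv` is a two-column, tab-separated alias-to-term table. The function names `load_aliases` and `generic_cancer_share` are hypothetical.

```python
import json
from collections import Counter

# Alias rows copied from the grep output earlier in this thread.
# Assumption: disease_alias.tsv is "alias<TAB>term", one pair per line.
ALIAS_TSV = """Advanced Solid Tumor\tCancer
Solid tumor\tCancer
All Solid Tumors\tCancer
Any cancer type\tCancer
Solid tumors\tCancer
Malignant neoplastic disease\tCancer
All Tumors\tCancer
"""


def load_aliases(tsv_text):
    """Parse 'alias<TAB>term' lines into a dict of alias -> term."""
    aliases = {}
    for line in tsv_text.splitlines():
        if not line.strip():
            continue
        alias, term = line.split("\t", 1)
        aliases[alias] = term
    return aliases


def generic_cancer_share(json_lines, aliases, generic_term="Cancer"):
    """Count records whose indication name maps to generic_term.

    Returns (generic_count, total_count, per_name_counter).
    Assumption: each non-empty line is a JSON object shaped like
    {"jax": {"indication": {"name": ...}}}.
    """
    per_name = Counter()
    total = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        indication = (record.get("jax") or {}).get("indication") or {}
        name = indication.get("name")
        total += 1  # count every record, as `jq ... | wc -l` does
        if aliases.get(name) == generic_term:
            per_name[name] += 1
    return sum(per_name.values()), total, per_name


if __name__ == "__main__":
    aliases = load_aliases(ALIAS_TSV)
    # Made-up sample records for illustration only, not real CKB data.
    sample = [
        '{"jax": {"indication": {"name": "Advanced Solid Tumor"}}}',
        '{"jax": {"indication": {"name": "melanoma"}}}',
    ]
    generic, total, per_name = generic_cancer_share(sample, aliases)
    print(f"{generic} of {total} map to Cancer: {dict(per_name)}")
```

To run it against the real data, pass an open file handle for `jax.json` as `json_lines` and the contents of `disease_alias.tsv` to `load_aliases`. The per-name counter shows whether a single source term such as "Advanced Solid Tumor" accounts for most of the generic mappings.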

These numbers are accurate per discussion with Sara Patterson at CKB. Resolving.