foundation29org / RareCrowds

Package to serve public and freely-available data from rare disease patients.

Semantic intersection

PabloBotas opened this issue

Currently the intersection is strict: if a symptom from OMIM is not present in the Orphanet annotations, it is removed from the disease phenotype list. However, the two sources may annotate the same symptom at different levels of granularity: a symptom may be more precise in OMIM than in Orphanet, or vice versa.

In the lines following this comment:

## REMOVE SYMPTOMS NOT PRESENT IN OMIM DISEASES

instead of checking if k not in omim_phen:, the condition should also consider successors and predecessors:

if k not in omim_phen and not any(k in hpo.predecessors(item) for item in omim_phen) and not any(item in omim_phen for item in hpo.predecessors(k)):

In other words, if an Orphanet symptom is a predecessor of an OMIM symptom, keep it. If an Orphanet symptom is a successor of an OMIM symptom, keep it as well.
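
To make the proposed condition easier to check, here is a minimal, self-contained sketch. It is not the package's code. The PARENTS dict, the predecessors() helper, the keep_symptom() and semantic_intersection() functions and the term names are all hypothetical stand-ins. It also assumes that a predecessor is the direct parent (the more general term), which depends on how the real hpo graph is built.

```python
# Toy stand-in for the HPO graph (assumption): each term maps to its direct
# parents, i.e. the more general terms. Names are made up, not real HPO IDs.
PARENTS = {
    "seizure": [],
    "focal_seizure": ["seizure"],
    "focal_motor_seizure": ["focal_seizure"],
    "ataxia": [],
}


def predecessors(term):
    """Direct parents of a term; plays the role of hpo.predecessors(term)."""
    return PARENTS.get(term, [])


def keep_symptom(k, omim_phen):
    """Return True if Orphanet symptom k survives the relaxed intersection."""
    if k in omim_phen:
        return True
    # k is a direct predecessor of some OMIM symptom
    if any(k in predecessors(item) for item in omim_phen):
        return True
    # some direct predecessor of k is itself an OMIM symptom
    if any(p in omim_phen for p in predecessors(k)):
        return True
    return False


def semantic_intersection(orpha_phen, omim_phen):
    """Filter Orphanet symptoms, keeping exact and one-level matches."""
    return [k for k in orpha_phen if keep_symptom(k, omim_phen)]


orpha = ["seizure", "focal_motor_seizure", "ataxia", "focal_seizure"]
omim = {"focal_seizure"}
kept = semantic_intersection(orpha, omim)
print(kept)
```

With these toy terms, "ataxia" is dropped and the other three symptoms are kept. Note that the check only looks one level up or down: if hpo is a networkx DiGraph (an assumption), predecessors() returns direct predecessors only, so a grandparent term would still be removed. Matching across several levels would need a transitive lookup such as networkx.ancestors.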