obophenotype / upheno

The Unified Phenotype Ontology (uPheno) integrates multiple phenotype ontologies into a unified cross-species phenotype ontology.

Home Page: https://obophenotype.github.io/upheno/

Create a uPheno release product for data analysis

matentzn opened this issue

@pnrobinson requested a uPheno release product that we should add to the uPheno2 release before the end of March.

[image]

@pnrobinson, I hope I understood correctly that, given the above picture, you want the following table:

| taxon | upheno_id | original_phenotype | gene | human_orthologue |
| --- | --- | --- | --- | --- |
| NCBITaxon:10090 | UPHENO:0034327 | MP:0030719 | ncbi.gene:68646 | hgnc:26404 |

Is this correct?

@matentzn this table would be exactly what we need!

First draft

Code to generate Table
from neo4j import GraphDatabase
import pandas as pd

# Bolt URL of the Neo4j database (placeholder from the original post)
bolt_url = "ASK_NICO"

# Cypher query: gene -[has_phenotype]-> phenotype -[subclass_of]-> uPheno class,
# restricted to genes that have a human (NCBITaxon:9606) orthologue
query = """
MATCH
(upheno:`biolink:PhenotypicFeature` WHERE upheno.id STARTS WITH "UPHENO:")
<-[:`biolink:subclass_of`]-(phenotype:`biolink:PhenotypicFeature`)
<-[gena:`biolink:has_phenotype`]-(gene:`biolink:Gene`)
-[:`biolink:orthologous_to`]-(human_gene:`biolink:Gene` WHERE "NCBITaxon:9606" IN human_gene.in_taxon)
RETURN
    upheno.id,
    phenotype.id,
    gene.id,
    gena.negated,
    CASE WHEN gene.in_taxon IS NOT NULL AND size(gene.in_taxon) > 0
         THEN REDUCE(s = "", x IN gene.in_taxon | s + x + CASE WHEN x <> gene.in_taxon[size(gene.in_taxon)-1] THEN "|" ELSE "" END)
         ELSE "" END AS gene_in_taxon,
    human_gene.id,
    gena.primary_knowledge_source,
    gena.publications
"""

# Column names, in the same order as the RETURN clause
columns = ["upheno_grouping", "phenotype", "gene", "negated", "taxon", "human_orthologue", "source", "publications"]

# Run the query, collect each record's values as one row, and close the driver afterwards
with GraphDatabase.driver(bolt_url) as driver:
    with driver.session() as session:
        data = [record.values() for record in session.run(query)]

df = pd.DataFrame(data, columns=columns)
df
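
The `REDUCE` expression in the query is the least obvious part, so here is a rough Python mirror of what it appears to do: join the values of `gene.in_taxon` with `|`, returning an empty string for a missing or empty list. The function name `join_like_reduce` and the sample lists are made up for illustration. Because the expression compares each value with the last element rather than checking its position, a list with a repeated value (a hypothetical edge case, not something seen in the draft data) would lose a separator, which a plain join avoids.

```python
def join_like_reduce(taxa):
    """Mirror the Cypher REDUCE: append "|" after x unless x equals the last element."""
    if not taxa:  # covers both NULL (None) and an empty list
        return ""
    out = ""
    for x in taxa:
        out += x
        if x != taxa[-1]:  # value comparison, as in the Cypher expression
            out += "|"
    return out


# Illustrative inputs (assumptions, not taken from the database)
single = ["NCBITaxon:7955"]
multiple = ["NCBITaxon:10090", "NCBITaxon:9606"]
repeated = ["NCBITaxon:9606", "NCBITaxon:10090", "NCBITaxon:9606"]

print(join_like_reduce(single))
print(join_like_reduce(multiple))
print(join_like_reduce(repeated))  # separator after the first element is dropped
print("|".join(repeated))          # a plain join keeps every separator
```

If repeated taxon values can never occur in `in_taxon`, the two approaches give the same string and the query can stay as it is.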

Draft result:

| upheno_grouping | phenotype | gene | negated | taxon | human_orthologue | source | publications |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UPHENO:0000508 | ZP:0000606 | ZFIN:ZDB-GENE-040426-1675 | | NCBITaxon:7955 | HGNC:9721 | infores:zfin | ['ZFIN:ZDB-PUB-170311-8'] |
| UPHENO:0000508 | ZP:0000606 | ZFIN:ZDB-GENE-040426-1675 | | NCBITaxon:7955 | HGNC:30262 | infores:zfin | ['ZFIN:ZDB-PUB-170311-8'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00044068 | | NCBITaxon:6239 | HGNC:12927 | infores:wormbase | ['PMID:16803962'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00009178 | | NCBITaxon:6239 | HGNC:15664 | infores:wormbase | ['PMID:22073243'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00009178 | | NCBITaxon:6239 | HGNC:15663 | infores:wormbase | ['PMID:22073243'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00000914 | | NCBITaxon:6239 | HGNC:9984 | infores:wormbase | ['PMID:29301909'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00000914 | | NCBITaxon:6239 | HGNC:9983 | infores:wormbase | ['PMID:29301909'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00000914 | | NCBITaxon:6239 | HGNC:9982 | infores:wormbase | ['PMID:29301909'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00022620 | | NCBITaxon:6239 | HGNC:20165 | infores:wormbase | ['PMID:25635455'] |
| UPHENO:0000508 | WBPhenotype:0000848 | WB:WBGene00022620 | | NCBITaxon:6239 | HGNC:17407 | infores:wormbase | ['PMID:25635455'] |
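
The draft has more columns than the five-column table requested earlier in the thread, and names them differently. Here is a minimal sketch of mapping one layout onto the other with the standard `csv` module. It assumes the draft is available as tab-separated text with a header row matching the draft column names. The `COLUMN_MAP` pairing and the function name `to_requested_layout` are assumptions for illustration, not an agreed specification. The two sample rows are copied from the draft result, with the `negated` column left empty as in the draft.

```python
import csv
import io

# Two rows copied from the draft result; the negated column is empty there.
DRAFT_TSV = (
    "upheno_grouping\tphenotype\tgene\tnegated\ttaxon\thuman_orthologue\tsource\tpublications\n"
    "UPHENO:0000508\tZP:0000606\tZFIN:ZDB-GENE-040426-1675\t\tNCBITaxon:7955\tHGNC:9721\tinfores:zfin\t['ZFIN:ZDB-PUB-170311-8']\n"
    "UPHENO:0000508\tWBPhenotype:0000848\tWB:WBGene00044068\t\tNCBITaxon:6239\tHGNC:12927\tinfores:wormbase\t['PMID:16803962']\n"
)

# Draft column -> requested column, in the requested output order.
# This pairing is an assumption based on the column names in the thread.
COLUMN_MAP = {
    "taxon": "taxon",
    "upheno_grouping": "upheno_id",
    "phenotype": "original_phenotype",
    "gene": "gene",
    "human_orthologue": "human_orthologue",
}


def to_requested_layout(tsv_text):
    """Parse draft TSV text and keep only the five requested columns, renamed."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [{new: row[old] for old, new in COLUMN_MAP.items()} for row in reader]


rows = to_requested_layout(DRAFT_TSV)
print("\t".join(COLUMN_MAP.values()))
for row in rows:
    print("\t".join(row.values()))
```

The extra draft columns (`negated`, `source`, `publications`) are simply dropped here; whether the release product should keep them is left open.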

@pnrobinson if this works for you, you can do a first experiment with this table:

https://www.dropbox.com/scl/fi/zbjt48afy4efkbki8szy5/upheno_gene_human_orthologues.tsv?rlkey=yr0vl7ky3ldeaura8kllagubn&dl=0

@kevinschaper did all the heavy lifting, so THANK YOU!