weekmo / biodb_team3

BioDB Repository for Team-3

Working title under construction

Abstract

In the course of the BioDB lab course project, we planned a framework to compare diseases based on pathway and phenotype overlap. For this, three databases need to be downloaded, structured, and turned into easily queryable SQLite databases. A Python package should provide the necessary functions.

Plan

Databases

The preliminarily chosen databases are:

  • UniProt (Team 3)
  • WikiPathways, Genenames (Team 2)
  • Monarch (Team 1)

Team task

As Team 3, we need to design and implement a database that connects diseases (OMIM IDs) to proteins (UniProt accessions). The database should be easily queryable.

Steps of team project development

1- Choose the biomedical database and obtain the raw data (txt)

2- Design and implement a data model in Python (SQLAlchemy)

3- Provide a Python package that handles querying the database

4- Query the database

5- Document the project (wiki page, doctest, comments)
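Steps 2 to 4 can be illustrated with a minimal sketch. It is not the team's actual data model: it uses the standard-library sqlite3 module instead of SQLAlchemy to stay dependency-free, and every table, column, and function name in it (protein, disease, protein_disease, create_database, add_association, proteins_for_disease, diseases_for_protein) is hypothetical.

```python
import sqlite3

# Hypothetical schema: one table per entity plus a link table, because one
# protein can be associated with several diseases and vice versa.
SCHEMA = """
CREATE TABLE IF NOT EXISTS protein (
    accession TEXT PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS disease (
    omim_id INTEGER PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS protein_disease (
    accession TEXT NOT NULL REFERENCES protein(accession),
    omim_id INTEGER NOT NULL REFERENCES disease(omim_id),
    PRIMARY KEY (accession, omim_id)
);
"""


def create_database(path=":memory:"):
    """Open (or create) the SQLite database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_association(conn, accession, omim_id):
    """Store one protein-disease link; repeated calls are harmless."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO protein (accession) VALUES (?)",
            (accession,),
        )
        conn.execute(
            "INSERT OR IGNORE INTO disease (omim_id) VALUES (?)",
            (omim_id,),
        )
        conn.execute(
            "INSERT OR IGNORE INTO protein_disease (accession, omim_id) "
            "VALUES (?, ?)",
            (accession, omim_id),
        )


def proteins_for_disease(conn, omim_id):
    """Return the sorted accessions linked to one OMIM ID."""
    rows = conn.execute(
        "SELECT accession FROM protein_disease "
        "WHERE omim_id = ? ORDER BY accession",
        (omim_id,),
    )
    return [row[0] for row in rows]


def diseases_for_protein(conn, accession):
    """Return the sorted OMIM IDs linked to one accession."""
    rows = conn.execute(
        "SELECT omim_id FROM protein_disease "
        "WHERE accession = ? ORDER BY omim_id",
        (accession,),
    )
    return [row[0] for row in rows]
```

The link table keeps the many-to-many relation explicit, so both query directions reduce to a single indexed lookup. The same three tables could be declared as SQLAlchemy models if the project follows step 2 literally.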

Current Workflow

  • Final decision on the protein database (UniProt)
  • Plan the data model
  • Download the UniProt data (CSV file)
  • Implement a function to parse the disease column of the UniProt CSV file
  • Parse http://www.uniprot.org/docs/mimtosp.txt into a list
  • Implement the data model in SQLAlchemy
  • Populate the database using SQLAlchemy
  • Set up packages
  • Implement query functions
  • Document the code (wiki page)
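The disease-column parsing step in the workflow above might look like the following sketch. It rests on assumptions that are not stated in this README: that OMIM IDs appear in the disease text as markers of the form [MIM:123456], and that the columns are named "Entry" and "Involvement in disease". The function names are hypothetical as well, and parsing mimtosp.txt is not covered here.

```python
import csv
import re

# Assumption: OMIM IDs are embedded in the disease text as "[MIM:<digits>]".
MIM_PATTERN = re.compile(r"\[MIM:(\d+)\]")


def extract_omim_ids(disease_text):
    """Return the unique OMIM IDs found in one disease cell, in order."""
    omim_ids = []
    for digits in MIM_PATTERN.findall(disease_text or ""):
        omim_id = int(digits)
        if omim_id not in omim_ids:
            omim_ids.append(omim_id)
    return omim_ids


def parse_uniprot_table(
    handle,
    accession_column="Entry",                  # assumed column name
    disease_column="Involvement in disease",   # assumed column name
    delimiter=",",
):
    """Read an open CSV file and return (accession, omim_id) pairs."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    pairs = []
    for row in reader:
        for omim_id in extract_omim_ids(row.get(disease_column)):
            pairs.append((row[accession_column], omim_id))
    return pairs
```

If the download is tab-separated rather than comma-separated, passing delimiter="\t" should be enough. The resulting pairs are in the shape a populate step would need, one row per protein-disease link.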

Team-3 Members

  • Mohammed
  • (5 other students; names are hidden for privacy)

Languages

  • Jupyter Notebook: 67.2%
  • HTML: 32.4%
  • Python: 0.4%
  • Makefile: 0.0%
  • Shell: 0.0%