BigDataBiology / AMPSphereWebsite

Website for global antimicrobial peptides.

Home Page: https://ampsphere.big-data-biology.org/home

Todos & progress for the development

AdeBC opened this issue

Update on home page

  • Create a new tab on the Help page introducing AMPSphere.
  • The first column, by data type: genomes and metagenomes, AMPs and families.
  • Move the statistics (possibly shown as a plot) and the search box (MMseqs and HMMSearch) to the home page, with two buttons to select the search method.
  • Create a Krona plot for AMP distribution across environments.

Potential icon resources: https://www.g2.com/products/biorender/competitors/alternatives

Update on AMP page

  • Two numbers for genomes and metagenomes in General information
  • Two subsections for graphs: Comparison with entire database and Features for the current AMP
  • Secondary structure: pie chart (the rest is disordered)
  • Sunburst plots for the distribution across habitats and hosts (all lineages)
  • Titles for the distribution plots.
  • Download button: download as a PDF or an HTML file.
  • Associated metagenomes and proGenomes2 genomes
  • Basic information (a table with three columns: GMSC, SAMPLE/GENOME, TAXONOMIC_GENE_ORIGIN)
  • Geographical distribution (colors are determined by microontology level I)
  • Pep-fold for 3D structure prediction.
  • Outline navigation on the left side.
  • Figure captions for AMP graphs.

Microontology

air	3681
anthropogenic	4055
anthropogenic:built environment	21418
anthropogenic:food	7
anthropogenic:food:fermented food	1848
anthropogenic:mine	555
anthropogenic:mock community	106
anthropogenic:sludge	33206
anthropogenic:wastewater	39952
aquatic	2093
aquatic:estuarine	4521
aquatic:freshwater	55769
aquatic:freshwater:lake	31024
aquatic:freshwater:pond	458
aquatic:freshwater:river	13313
aquatic:groundwater	7670
aquatic:ice	3214
aquatic:marine	96479
aquatic:marine:pelagic	43051
aquatic:saline	19152
aquatic:spring:hot spring	2304
aquatic:spring:hydrothermal vent	1523
host-associated:algal host	232
host-associated:animal host	19063
host-associated:animal host:coral reef	3407
host-associated:animal host:digestive tract	654
host-associated:animal host:digestive tract:intestine	218637
host-associated:animal host:digestive tract:mouth	11402
host-associated:animal host:digestive tract:mouth:saliva	7458
host-associated:animal host:digestive tract:rumen	41552
host-associated:animal host:insect host	2854
host-associated:animal host:mammalian host:human host	271
host-associated:animal host:reproductive tract	197
host-associated:animal host:respiratory tract	3377
host-associated:animal host:skin	6535
host-associated:animal host:urogenital tract	109
host-associated:plant host	6249
host-associated:plant host:leaf	2993
host-associated:plant host:phyllosphere	5264
host-associated:plant host:plant litter	9896
host-associated:plant host:rhizosphere	75929
nan	4094
ph:alkaline	1514
sediment	51020
terrestrial	4116
terrestrial:dust	50
terrestrial:halite	326
terrestrial:permafrost	9048
terrestrial:soil	240668
terrestrial:subsurface	1806
terrestrial:wetland	20207
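
For the geographical distribution item above (colors determined by microontology level I), a rough sketch of collapsing the terms in this list to level I. It assumes the two-column, tab-separated layout shown here, that level I is the text before the first colon, and that each count belongs to that exact term only. The name `level1_totals` is made up for illustration.

```python
from collections import defaultdict


def level1_totals(lines):
    """Sum the counts per level I term (the part before the first ':')."""
    totals = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        term, count = line.rsplit("\t", 1)
        totals[term.split(":", 1)[0]] += int(count)
    return dict(totals)


# A few rows copied from the list above.
sample = [
    "air\t3681",
    "aquatic\t2093",
    "aquatic:estuarine\t4521",
    "terrestrial:dust\t50",
    "terrestrial:halite\t326",
]
print(level1_totals(sample))
# {'air': 3681, 'aquatic': 6614, 'terrestrial': 376}
```

Each level I key could then be mapped to one color in the map legend.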

Potentially useful resources

Update on AMP page (frontend)

  • Add "branchvalues": 'total' for sunburst plots.
  • Retrieve host lineages using living-tree-toolkit from NCBI taxonomy ID (backend).
  • Smaller secondary structure pie chart, make the relationships table full-width.
  • Fix the colors in geo-distribution map (backend).
  • Round the number of biochemical features to 3 decimals (backend).
  • Distribution: habitats and hosts -> three distribution graphs: habitats, hosts and origins (switch using Plot.js buttons)
  • Sidebar: fix the issue on default-openeds.
  • Implement/adopt layout for line/bar/volcano plot.
  • Browse page: a single table that can be filtered using host, habitat, family, etc.
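
On the "branchvalues": 'total' item: with that setting, each parent value in a sunburst trace is read as already including its descendants. In the Microontology list above the parents are smaller than their children (aquatic 2093 vs. aquatic:freshwater 55769), so the counts look per-term and would need to be accumulated first. A rough backend-side sketch, assuming per-term counts keyed by colon-separated terms; `sunburst_nodes` is a made-up name, and intermediate nodes with no row of their own (e.g. aquatic:spring) are created on the fly.

```python
def sunburst_nodes(counts):
    """Convert {'a:b:c': n} per-term counts into ids/labels/parents/values
    where every value includes all descendants (for branchvalues='total')."""
    totals = {}
    for term, count in counts.items():
        parts = term.split(":")
        # Add the count to the term itself and to each of its ancestors.
        for depth in range(1, len(parts) + 1):
            node = ":".join(parts[:depth])
            totals[node] = totals.get(node, 0) + count
    ids = sorted(totals)
    return {
        "ids": ids,
        "labels": [i.rsplit(":", 1)[-1] for i in ids],
        "parents": [i.rsplit(":", 1)[0] if ":" in i else "" for i in ids],
        "values": [totals[i] for i in ids],
        "branchvalues": "total",
    }


counts = {
    "aquatic": 2093,
    "aquatic:freshwater": 55769,
    "aquatic:freshwater:lake": 31024,
    "aquatic:spring:hot spring": 2304,
}
nodes = sunburst_nodes(counts)
print(dict(zip(nodes["ids"], nodes["values"])))
```

The returned keys follow the attribute names of a sunburst trace, so the dict could be sent to the frontend more or less as is.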

About 3D structure prediction @celiosantosjr

Pep-fold has neither a standalone version that can be run locally nor an API service that can be called on a regular basis, so embedding Pep-fold3 in our server is not possible for now.
I'll try to find another 3D structure prediction approach that can be embedded.
If you know of any good resources, please post them here and I'll try them as well. Thanks!

  • Just keep three tables in the AMP page: GMSC, Genome/Sample, Scientific name.
  • Merge complete_table_origins.tsv.gz, fna table, Metadata (db), and AMP (db) and generate a single table in the database
mmseqs easy-search QUERY.fasta AMPSphere/AMPSphere_v.2021-03.faa.gz alnRes.m8 tmp
  • Implement helical wheel generation on the frontend side (mimic the modlamp API).
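
For the helical wheel item, a rough sketch of the geometry only, written in Python for illustration (the frontend version would be a port). It assumes an ideal α-helix with 3.6 residues per turn, i.e. 100° between consecutive residues, starting at 12 o'clock and going clockwise. Whether modlamp uses exactly this layout has not been checked here, and `helical_wheel_coords` is a made-up name.

```python
import math


def helical_wheel_coords(sequence, step_deg=100.0, radius=1.0):
    """Return (residue, x, y) per residue, advancing step_deg degrees each time."""
    coords = []
    for i, residue in enumerate(sequence):
        # 90 degrees is 12 o'clock; subtracting moves clockwise.
        angle = math.radians(90.0 - i * step_deg)
        x = round(radius * math.cos(angle), 3)
        y = round(radius * math.sin(angle), 3)
        coords.append((residue, x, y))
    return coords


# Hypothetical peptide, not taken from the database.
for residue, x, y in helical_wheel_coords("GLFDIVKKVV"):
    print(residue, x, y)
```

With a 100° step, residue 19 lands on the same spoke as residue 1, so longer peptides would need a second ring or a small radial offset.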

PRIORITY

Already finished

  • open a new API to get overall statistics of the entire ampsphere database
  • implement distribution API
  • test the backend API at a large scale - using pytest (https://fastapi.tiangolo.com/tutorial/testing/).
  • integrate the backend and the frontend (fix the numbers on the home page).
  • include a series of pagination buttons on the relationships table.
  • include graphs (geoDistribution, habitatDistribution, hostDistribution, and PieChart - genomes and metagenomes) using the Carousel component of Element Plus.

Database (first week)

  • create a statistics table #7
  • refactor the repository structure of the database and necessary pre-computed files (finalize file structure)
  • insert helical wheel paths in the AMP table

Backend (first week)

  • include gene sequence in the /v1/amps/{accession} API
  • change response code to 200 when no AMPs or families can be returned using browse API
  • host the backend on the AWS server
  • add total number of items (and total pages) for each paginated API.
  • implement the download API (only four tsv tables in the database, should be done once new server is ready)
  • fix the colors in geo-distribution map (backend, lower priority).
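
For the pagination item (total number of items and total pages) and the empty-result item, a rough sketch of the bookkeeping, independent of the web framework. The field names (`info`, `total_items`, `total_pages`, ...) and the zero-based `page` are assumptions, not the actual API schema.

```python
import math


def paginate(items, page=0, page_size=20):
    """Slice one page out of items and report the totals next to it."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    total_items = len(items)
    total_pages = math.ceil(total_items / page_size)
    start = page * page_size
    return {
        "info": {
            "current_page": page,
            "page_size": page_size,
            "total_items": total_items,
            "total_pages": total_pages,
        },
        # An out-of-range page or an empty table gives an empty list.
        "data": items[start:start + page_size],
    }


result = paginate(list(range(45)), page=2, page_size=20)
print(result["info"]["total_pages"], result["data"])
# 3 [40, 41, 42, 43, 44]
```

An empty query result comes back as an ordinary payload with zero totals and an empty `data` list, which fits returning 200 from the browse API instead of an error.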

Frontend

  • update the AMP page to use real data from the backend (first week)
  • include gene sequences in the downloaded relationships table
  • implement a browse page and a search result page (integrated with the backend API)
  • browse page: a single table that can be filtered using host, habitat, family, etc.
  • implement a family page (integrated with the backend API)
  • implement other pages (integrated with the backend API when necessary)
  • download page: four buttons for the four tsv tables in the database, and a link to the Zenodo repository
  • include Fudan, ISTBI, BDB logo and info...
  • include badges indicating qualities of AMPs (similar to GitHub badges)
  • sidebar: fix the issue on default-openeds.

Performance optimization

  • optimize text search performance, LOW PRIORITY
  • finalize frontend layout: enhance responsiveness