Todos & progress for the development
AdeBC opened this issue · comments
Hui-Chong commented
Update on home page
- Create a new tab in Help page for introductions of AMPSphere.
- The first column, by data type: genomes and metagenomes, AMPs and families.
- Move the statistics (maybe shown as a plot) and the search box (MMseqs and HMMSearch) to the home page, with two buttons to select the search method.
- Create a Krona plot for AMP distribution across environments.
Potential icon resources: https://www.g2.com/products/biorender/competitors/alternatives
Update on AMP page
- Two numbers for genomes and metagenomes in `General information`
- Two subsections for graphs: Comparison with entire database and Features for the current AMP
- Secondary structure: pie chart (the rest is `disordered`)
- Sunburst plots for the distribution across habitats and hosts (all lineages)
- Titles for the distribution plots.
- Download button: download as a PDF or an HTML file.
- Associated metagenomes and proGenomes2 genomes
- Basic information (a table with three different columns: GMSC SAMPLE/GENOME TAXONOMIC_GENE_ORIGIN)
- Geographical distribution (colors are determined by microontology level I)
- Pep-fold for 3D structure prediction.
- Outline navigation on the left side.
- Figure captions for AMP graphs.
Microontology
air 3681
anthropogenic 4055
anthropogenic:built environment 21418
anthropogenic:food 7
anthropogenic:food:fermented food 1848
anthropogenic:mine 555
anthropogenic:mock community 106
anthropogenic:sludge 33206
anthropogenic:wastewater 39952
aquatic 2093
aquatic:estuarine 4521
aquatic:freshwater 55769
aquatic:freshwater:lake 31024
aquatic:freshwater:pond 458
aquatic:freshwater:river 13313
aquatic:groundwater 7670
aquatic:ice 3214
aquatic:marine 96479
aquatic:marine:pelagic 43051
aquatic:saline 19152
aquatic:spring:hot spring 2304
aquatic:spring:hydrothermal vent 1523
host-associated:algal host 232
host-associated:animal host 19063
host-associated:animal host:coral reef 3407
host-associated:animal host:digestive tract 654
host-associated:animal host:digestive tract:intestine 218637
host-associated:animal host:digestive tract:mouth 11402
host-associated:animal host:digestive tract:mouth:saliva 7458
host-associated:animal host:digestive tract:rumen 41552
host-associated:animal host:insect host 2854
host-associated:animal host:mammalian host:human host 271
host-associated:animal host:reproductive tract 197
host-associated:animal host:respiratory tract 3377
host-associated:animal host:skin 6535
host-associated:animal host:urogenital tract 109
host-associated:plant host 6249
host-associated:plant host:leaf 2993
host-associated:plant host:phyllosphere 5264
host-associated:plant host:plant litter 9896
host-associated:plant host:rhizosphere 75929
nan 4094
ph:alkaline 1514
sediment 51020
terrestrial 4116
terrestrial:dust 50
terrestrial:halite 326
terrestrial:permafrost 9048
terrestrial:soil 240668
terrestrial:subsurface 1806
terrestrial:wetland 20207
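The colon-separated labels above form a hierarchy, which is what the sunburst plots need. Below is a minimal Python sketch of turning such counts into parent/child lists. It assumes the counts listed are per-label (not already cumulative) and that a Plotly-style trace with `ids`, `labels`, `parents` and `values` is the target; the function name `build_sunburst` is hypothetical. Each count is credited to the node and all of its ancestors, so parents are never smaller than the sum of their children, which is what the `branchvalues: 'total'` setting mentioned later in this thread requires. Missing intermediate levels (for example `aquatic:spring`) are created implicitly.

```python
from collections import defaultdict


def build_sunburst(counts):
    """Turn {'a:b:c': n} counts into ids/labels/parents/values lists.

    Each node's value is its own count plus the counts of all of its
    descendants (assumption: the input counts are not cumulative).
    """
    totals = defaultdict(int)
    for path, n in counts.items():
        parts = path.split(":")
        # Credit the count to the node itself and to every ancestor,
        # creating implicit ancestors on the way.
        for depth in range(1, len(parts) + 1):
            totals[":".join(parts[:depth])] += n

    ids = sorted(totals)
    return {
        "ids": ids,
        "labels": [i.rsplit(":", 1)[-1] for i in ids],
        "parents": [i.rsplit(":", 1)[0] if ":" in i else "" for i in ids],
        "values": [totals[i] for i in ids],
        "branchvalues": "total",
    }


# A few rows copied from the list above.
sample = {
    "aquatic": 2093,
    "aquatic:freshwater": 55769,
    "aquatic:freshwater:lake": 31024,
    "aquatic:spring:hot spring": 2304,
}
trace = build_sunburst(sample)
```

The first element of a label (`label.split(":")[0]`) is the level I category, which could also drive the colors of the geographical distribution map.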
Hui-Chong commented
Potentially useful resources
Hui-Chong commented
Update on AMP page (frontend)
- Add `"branchvalues": 'total'` for sunburst plots (ref.).
- Retrieve host lineages using living-tree-toolkit from NCBI taxonomy ID (backend).
- Smaller secondary structure pie chart, make the relationships table full-width.
- Fix the colors in geo-distribution map (backend).
- Round the number of biochemical features to 3 decimals (backend).
- Distribution: habitats and hosts -> three distribution graphs: habitats, hosts and origins (switch using Plot.js buttons)
- Sidebar: fix the issue on `default-openeds`.
- Implement/adopt layout for line/bar/volcano plot.
- Browse page: a single table that can be filtered using host, habitat, family, etc.
Hui-Chong commented
About 3D structure prediction @celiosantosjr
Pep-fold has neither a standalone version that can be run locally nor an API service that can be called regularly, so embedding Pep-fold3 in our server is not possible for now.
I'll try to find another 3D structure prediction approach that can be embedded.
If you know of any good resources, please attach them here and I'll try them as well. Thanks!
Hui-Chong commented
- Just keep three tables in the AMP page: GMSC, Genome/Sample, Scientific name.
Hui-Chong commented
- Merge complete_table_origins.tsv.gz, fna table, Metadata (db), and AMP (db) and generate a single table in the database
Hui-Chong commented
mmseqs easy-search QUERY.fasta AMPSphere/AMPSphere_v.2021-03.faa.gz alnRes.m8 tmp
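
For calling this search from the backend, a rough Python sketch could build the same command and parse the resulting `alnRes.m8`. The column names follow the documented default order of MMseqs2's tabular output; the helper names (`easy_search_command`, `parse_m8`) and the sample row, including its accession, are made up for illustration.

```python
import csv
import io

# Documented default column order of MMseqs2's tabular (BLAST-like) output.
M8_COLUMNS = ["query", "target", "fident", "alnlen", "mismatch", "gapopen",
              "qstart", "qend", "tstart", "tend", "evalue", "bits"]


def easy_search_command(query, database, result="alnRes.m8", tmp="tmp"):
    """Build the argument list for the command above (e.g. for subprocess.run)."""
    return ["mmseqs", "easy-search", query, database, result, tmp]


def parse_m8(handle):
    """Yield one dict per alignment row, converting numeric fields."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(M8_COLUMNS, row))
        for key in ("alnlen", "mismatch", "gapopen", "qstart", "qend", "tstart", "tend"):
            hit[key] = int(hit[key])
        for key in ("fident", "evalue", "bits"):
            hit[key] = float(hit[key])
        yield hit


cmd = easy_search_command("QUERY.fasta", "AMPSphere/AMPSphere_v.2021-03.faa.gz")

# Made-up result row, only to show the parsing; not real search output.
fake_m8 = "query1\tHYPOTHETICAL_AMP_1\t0.950\t20\t1\t0\t1\t20\t1\t20\t1.2E-08\t45\n"
hits = list(parse_m8(io.StringIO(fake_m8)))
```

Passing the command as a list (rather than a shell string) avoids quoting problems when the query path comes from user input.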
Hui-Chong commented
- Implement helical wheel generation on the frontend side (mimicking the modlAMP API).
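
The geometry behind a helical wheel is small enough to sketch here. An ideal alpha helix has 3.6 residues per turn, so consecutive residues are 100 degrees apart when projected down the helix axis. The Python below only illustrates that calculation (the frontend version would be JavaScript); starting at 12 o'clock and going clockwise is an assumption, not necessarily what modlAMP does, and the function name and example peptide are arbitrary.

```python
import math


def helical_wheel_coords(sequence, angle_step=100.0, radius=1.0):
    """Return (residue, x, y) for each residue projected down the helix axis."""
    coords = []
    for i, residue in enumerate(sequence):
        # Assumption: first residue at 12 o'clock, subsequent ones clockwise.
        theta = math.radians(90.0 - i * angle_step)
        coords.append((residue, radius * math.cos(theta), radius * math.sin(theta)))
    return coords


# Arbitrary example peptide, used only to exercise the function.
wheel = helical_wheel_coords("GIGKFLHSAKKFGKAFVGEIMNS")
```

With a 100 degree step, residue 19 lands on the same spot as residue 1 (18 x 100 = 1800 degrees, five full turns), so a real plot usually draws later turns on a slightly larger radius to avoid overlap.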
Hui-Chong commented
PRIORITY
Already finished
- open a new API to get overall statistics of the entire ampsphere database
- implement distribution API
- test the backend API at a large scale - using pytest (https://fastapi.tiangolo.com/tutorial/testing/).
- integrate the backend and the frontend (fix the numbers on the home page).
- include a series of pagination buttons on the relationships table.
- include graphs (geoDistribution, habitatDistribution, hostDistribution, and PieChart - genomes and metagenomes) using the `Carousel` API of element plus.
DataBase (first week)
- create a statistics table #7
- refactor the repository structure of the database and necessary pre-computed files (finalize file structure)
- insert helical wheel paths in the AMP table
Backend (first week)
- include gene sequence in the `/v1/amps/{accession}` API
- change response code to 200 when no AMPs or families can be returned using browse API
- host the backend on the AWS server
- add total number of items (and total pages) for each paginated API.
- implement the download API (only four tsv tables in the database, should be done once new server is ready)
- fix the colors in geo-distribution map (backend, lower priority).
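
Two of the backend items above (totals for each paginated API, and a 200 response when nothing matches) can be handled by one small helper. This is only a framework-independent Python sketch; the `Page` field names and the zero-based `page` index are assumptions, not the actual response schema.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical response body for a paginated endpoint."""
    data: list = field(default_factory=list)
    page: int = 0
    page_size: int = 20
    total_items: int = 0
    total_pages: int = 0


def paginate(items, page=0, page_size=20):
    """Slice `items` and report totals; an empty result is still a valid page."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    total_items = len(items)
    total_pages = math.ceil(total_items / page_size)
    start = page * page_size
    return Page(items[start:start + page_size], page, page_size, total_items, total_pages)


last = paginate(list(range(45)), page=2, page_size=20)
empty = paginate([])
```

Because an empty match returns a normal `Page` with `data == []` and `total_items == 0`, the endpoint can answer with status 200 and let the frontend render an empty table instead of handling an error.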
Frontend
- update the AMP page to use real data from the backend (first week)
- include gene sequences in the downloaded relationships table
- implement a browse page and a search result page (integrated with the backend API)
- browse page: a single table that can be filtered using host, habitat, family, etc.
- implement a family page (integrated with the backend API)
- implement other pages (integrated with the backend API when necessary)
- download page: four buttons for four tsv tables in the database, and link to the Zenodo repository
- include Fudan, ISTBI, BDB logo and info...
- include badges indicating the quality of AMPs (similar to GitHub badges)
- sidebar: fix the issue on default-openeds.
Performance optimization
- optimize text search performance, LOW PRIORITY
- finalize frontend layout: enhance responsiveness