BIGGDATA is a web portal for analyzing IG and TCR repertoire data, originally built for the Brent Iverson George Georgiou (BIGG) Laboratory.
The goal of this system is twofold: to curate and provide a central repository for sequencing data collected by the lab, and to standardize analysis of IG and TCR repertoire data.
Users are able to import data from local files, server files, NCBI SRA, web URLs or UT Austin GSAF sequencing core directories, preprocess sequencing data (quality & illumina adapter trimming, quality filtering, paired-end overlap consensus) and run a variety of analysis programs (MixCR, IGFFT, Abstar, etc) to annotate reads from IMGT and native databases. Paired VH::VL (or TRB::TRA) sequencing analysis is now supported as well.
All annotation results are standardized to facilitate downstrem analysis and generate useful insights into repertoire distribution, polarization, and loci & gene usage. Comparative analysis of results between sample datasets includes a variety of repertoire similarity metrics and co-clustering to determine clonal overlap.
python2.7 w/ Flask
rabbitmq-server
PostgreSQL Database
Celery task executor
Other python modules listed in requirements.txt
Trimmomatic
fastx_toolkit
Pandaseq
MiXCR
IGFFT
AbSTAR
rabbitmq-server
testing: celery -A app.celery worker --loglevel=debug
production: celery multi restart node1 --verbose -A app.celery --loglevel=info
flower --port=8001
python manage.py runserver -p 8000 -h 0.0.0.0