=================== == R E A D M E == =================== What follows is the order of operations in this pipeline, from a raw pdf of a metagenomics study (with a relevant SRA accession number) to the merged, concatenated, and screened fasta Note: This pipeline currently only works on 16S V4 sequences (though support for arbitrary sequences is a high priority going forward!) -------------------------- -- Order of operations: -- -------------------------- 1. Extract relevant SRA # from study PDF's -- script: ~/code/shell/bioinf/sraFinder.sh 2. Search online for SRA run table and metadata -- script: TBD -- [WORKAROUND]: type "!ncbi SRA<result from pdfshot.sh>" into DuckDuckGo 3. Use fastq-dump from SRA-toolkit to download the fastqs -- script: ~/code/shell/bioinf/from_barbera/srr_munch.sh 4. Filter fastqs to isolate the 16S V4 sequences -- script: ~/code/shell/bioinf/16S_extractor.sh 5. Merge forward and reverse fastqs -- script: ~/code/shell/bioinf/flash2_merge.sh 6. Convert fastqs to fastas with bbtools "reformat" -- script: ~/code/shell/bioinf/from_barbera/q2a_reformat.sh 7. Remove primers with cutadapt -- script: ~/code/shell/bioinf/primer_trim.sh [VERIFY] 8. Concatenate fastas -- script: ~/code/shell/bioinf/from_barbera/group_concat.sh [VERIFY] 9. Make group file for mothur -- script: ~/code/shell/bioinf/groupFormatter.sh 10. Make summary with mothur for combined fasta file -- script: TBD -- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt 11. Screen Sequences with mothur -- script: ~/code/shell/bioinf/screening_batch.txt -- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt 12. Screen for PhiX (as needed) -- script: ~/code/shell/bioinf/phix.sh -- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt