Pinjontall94 / bioinf

Bioinformatics Scripts (originally for Anaerobic Soil Decomposition metagenomics pipeline) [Mirrored from GitLab!]

===================
== R E A D   M E ==
===================

What follows is the order of operations in this pipeline, from a raw PDF of a
metagenomics study (with a relevant SRA accession number) to the merged,
concatenated, and screened fasta file.

Note: This pipeline currently only works on 16S V4 sequences (though support
for arbitrary sequences is a high priority going forward!)

--------------------------
-- Order of operations: --
--------------------------

1. Extract the relevant SRA accession number from the study PDFs
	-- script: ~/code/shell/bioinf/sraFinder.sh

2. Search online for SRA run table and metadata
	-- script: TBD
	-- [WORKAROUND]: type "!ncbi SRA<result from pdfshot.sh>" into DuckDuckGo

3. Use fastq-dump from the SRA Toolkit to download the fastqs
	-- script: ~/code/shell/bioinf/from_barbera/srr_munch.sh

4. Filter fastqs to isolate the 16S V4 sequences
	-- script: ~/code/shell/bioinf/16S_extractor.sh

5. Merge forward and reverse fastqs
	-- script: ~/code/shell/bioinf/flash2_merge.sh

6. Convert fastqs to fastas with BBTools reformat.sh
	-- script: ~/code/shell/bioinf/from_barbera/q2a_reformat.sh

7. Remove primers with cutadapt
	-- script: ~/code/shell/bioinf/primer_trim.sh [VERIFY]

8. Concatenate fastas 
	-- script: ~/code/shell/bioinf/from_barbera/group_concat.sh [VERIFY]

9. Make group file for mothur
	-- script: ~/code/shell/bioinf/groupFormatter.sh

10. Make summary with mothur for combined fasta file
	-- script: TBD
	-- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt

11. Screen sequences with mothur
	-- script: ~/code/shell/bioinf/screening_batch.txt
	-- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt
 
12. Screen for PhiX (as needed)
	-- script: ~/code/shell/bioinf/phix.sh
	-- (Relevant Info): ~/work/usda/docs/ASD_metaanal_screening_fasta.txt
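
As a rough illustration of steps 1 and 3 above, the sketch below pulls
SRA-style accessions out of text and prints the fastq-dump commands it would
run. It is not the contents of sraFinder.sh or srr_munch.sh. The function
names (find_accessions, print_dump_cmds), the accession pattern, and the
choice of --split-files and --gzip are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from the repository's scripts.

# Read text on stdin and print each unique SRA-style accession once.
# Assumed pattern: SRP/SRR/SRX/SRS or PRJNA followed by digits.
find_accessions() {
    grep -oE '(SR[PRXS]|PRJNA)[0-9]+' | sort -u
}

# Read accessions on stdin and print (without running) one fastq-dump
# command per run accession (SRR...). Other accession types are skipped.
print_dump_cmds() {
    local acc
    while read -r acc; do
        case "$acc" in
            SRR*) printf 'fastq-dump --split-files --gzip %s\n' "$acc" ;;
        esac
    done
}

# Possible usage, assuming pdftotext (poppler-utils) is installed and
# study.pdf is a placeholder file name:
#   pdftotext study.pdf - | find_accessions
#   find_accessions < study.txt | print_dump_cmds
```

Printing the commands instead of running them makes it easy to review the
list of runs before starting a large download.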

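Steps 8 through 11 can be pictured with the following sketch. It is not the
repository's group_concat.sh, groupFormatter.sh, or screening_batch.txt. It
assumes one fasta file per sample, named <sample>.fasta, and writes a
mothur-style group file (sequence name and group, separated by a tab) plus a
small mothur batch file. The function names, the file names, and the
screening values (maxambig=0 and a caller-supplied maxlength) are assumptions
for illustration, not the settings used in this pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assume one fasta per sample, named <sample>.fasta.

# Print "sequence_name<TAB>sample" for every header line in the given fastas.
make_group_file() {
    local fasta sample
    for fasta in "$@"; do
        sample=$(basename "$fasta" .fasta)
        # The sequence name is the first whitespace-delimited word after '>'.
        awk -v grp="$sample" '/^>/ { sub(/^>/, "", $1); print $1 "\t" grp }' "$fasta"
    done
}

# Print a mothur batch file that summarizes and then screens a combined fasta.
# Arguments: combined fasta, group file, maximum sequence length (assumed value).
write_screen_batch() {
    local fasta=$1 groups=$2 maxlength=$3
    printf 'summary.seqs(fasta=%s)\n' "$fasta"
    printf 'screen.seqs(fasta=%s, group=%s, maxambig=0, maxlength=%s)\n' \
        "$fasta" "$groups" "$maxlength"
}

# Possible usage (file names and the length cutoff are placeholders):
#   cat sampleA.fasta sampleB.fasta > combined.fasta
#   make_group_file sampleA.fasta sampleB.fasta > combined.groups
#   write_screen_batch combined.fasta combined.groups 275 > screen.batch
#   mothur screen.batch
```

Deriving the group name from the file name keeps the group file in step with
the concatenated fasta, as long as both are built from the same file list.
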
About

License: GNU Affero General Public License v3.0

Languages: Shell 95.1%, Perl 4.9%