Arkadiy-Garber / bvcn-binder-htseq

Lesson and tutorial on generating count tables from read-mapping data

Binder for BVCN Transcriptomics lesson 3

Initially forked from here. Thank you to the awesome binder team!

Part of the Bioinformatics Virtual Coordination Network :)

Walkthrough

Build a bowtie2 index from the Tremblaya genome

bowtie2-build -f Candidatus-Tremblaya-princeps_PCIT.fa Candidatus-Tremblaya-princeps_PCIT.fa

Map sample reads to the index using bowtie2

bowtie2 -x Candidatus-Tremblaya-princeps_PCIT.fa -U PCIT.sample.fastq -S Candidatus-Tremblaya-princeps_PCIT.sam --no-unal

Generate GFF file using Prokka

prokka Candidatus-Tremblaya-princeps_PCIT.fa

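Extract CDS rows from the GFF file (Prokka writes its output to a dated directory, here PROKKA_05192020)

grep -P '\tCDS' PROKKA_05192020/PROKKA_05192020.gff > PROKKA_05192020/cds_PROKKA_05192020.gff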

Examine the feature type and GFF attribute to be used as feature ID.
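
One way to examine the feature types and attribute keys is to tally column 3 and the keys in column 9 of the GFF file. The sketch below is only an illustration: the function name `summarize_gff` and the two example records are made up, and it assumes a standard nine-column, tab-separated GFF3 file such as the one Prokka writes.

```python
from collections import Counter


def summarize_gff(lines):
    """Count feature types (column 3) and attribute keys (column 9) in GFF3 lines."""
    types, keys = Counter(), Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # Prokka appends the genome sequence after this marker
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        types[cols[2]] += 1
        for field in cols[8].split(";"):
            if "=" in field:
                keys[field.split("=", 1)[0]] += 1
    return types, keys


# Made-up records for illustration; not taken from the tutorial data.
example = [
    "##gff-version 3",
    "contig1\tProdigal:2.6\tCDS\t10\t310\t.\t+\t0\tID=GENE_00001;locus_tag=GENE_00001;product=hypothetical protein",
    "contig1\tAragorn:1.2\ttRNA\t400\t475\t.\t-\t.\tID=GENE_00002;locus_tag=GENE_00002",
]
types, keys = summarize_gff(example)
print(dict(types))
print(dict(keys))
```

To run it on a real file, pass an open file handle instead of the example list. A feature type that appears in column 3 is a candidate for `-t`, and an attribute key present on every row of that type is a candidate for `-i`.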

Generate GFF file using Prodigal

prodigal -f gff -o Candidatus-Tremblaya-princeps_PCIT.prodigal.gff -i Candidatus-Tremblaya-princeps_PCIT.fa -a Candidatus-Tremblaya-princeps_PCIT.fa-proteins.faa -d Candidatus-Tremblaya-princeps_PCIT.fa-proteins.ffn -p meta

Run htseq-count with its default strandedness setting (-s yes)

htseq-count Candidatus-Tremblaya-princeps_PCIT.sam PROKKA_05192020/cds_PROKKA_05192020.gff -t CDS -i ID -c Candidatus-Tremblaya-princeps_PCIT.counts --nonunique none

Run htseq-count again with -s reverse, counting only the reads that map to the strand opposite each feature

htseq-count Candidatus-Tremblaya-princeps_PCIT.sam PROKKA_05192020/cds_PROKKA_05192020.gff -t CDS -i ID -c Candidatus-Tremblaya-princeps_PCIT.reverse.counts --nonunique none -s reverse
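
The effect of the `-s` option can be summarized with a small decision function. This is a simplified sketch for single-end reads only, not HTSeq's actual code; the function name `read_matches_feature` is hypothetical.

```python
def read_matches_feature(read_strand, feature_strand, stranded="yes"):
    """Return True if a single-end read's strand is acceptable for a feature.

    stranded="yes":     the read must be on the same strand as the feature
    stranded="reverse": the read must be on the opposite strand
    stranded="no":      strand is ignored
    """
    if stranded == "no":
        return True
    if stranded == "yes":
        return read_strand == feature_strand
    if stranded == "reverse":
        return read_strand != feature_strand
    raise ValueError(f"unknown strandedness mode: {stranded!r}")


# A read aligned to the minus strand, overlapping a gene on the plus strand:
for mode in ("yes", "reverse", "no"):
    print(mode, read_matches_feature("-", "+", stranded=mode))
```

Which mode is appropriate depends on the library preparation protocol, so running both modes and comparing the totals is a practical way to check.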

Inspect and compare the outputs
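
A comparison of two htseq-count outputs could look like the sketch below. It assumes each file is tab-separated with the feature ID in the first column and the count in the last, and that the summary rows htseq-count appends start with `__` (for example `__no_feature`). The function names and the example rows are made up for illustration.

```python
def parse_counts(lines):
    """Split htseq-count output into per-feature counts and '__' summary rows."""
    counts, summary = {}, {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], int(fields[-1])
        if name.startswith("__"):
            summary[name] = value
        else:
            counts[name] = value
    return counts, summary


def compare_counts(a, b):
    """Return {feature: (count_a, count_b)} for features whose counts differ."""
    return {
        feature: (a.get(feature, 0), b.get(feature, 0))
        for feature in sorted(set(a) | set(b))
        if a.get(feature, 0) != b.get(feature, 0)
    }


# Made-up example rows, not real output from this tutorial.
default_run = ["GENE_00001\t12\n", "GENE_00002\t0\n", "__no_feature\t88\n"]
reverse_run = ["GENE_00001\t3\n", "GENE_00002\t0\n", "__no_feature\t97\n"]

default_counts, default_summary = parse_counts(default_run)
reverse_counts, reverse_summary = parse_counts(reverse_run)
print(compare_counts(default_counts, reverse_counts))
print(sum(default_counts.values()), sum(reverse_counts.values()))
```

With real data, pass open file handles for the two count files in place of the example lists. The run that assigns far more reads to features usually indicates the strandedness of the library.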

Run the counts-to-tpm script to convert read counts to a normalized value, transcripts per million (TPM)
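
For reference, TPM is commonly computed by dividing each gene's read count by its length, then scaling those rates so they sum to one million. The sketch below illustrates that formula only; it is not the repository's counts-to-tpm script, and the function name, gene names, counts, and lengths are all made up.

```python
def counts_to_tpm(counts, lengths):
    """Convert read counts to TPM given gene lengths (same length unit for all genes)."""
    # Reads per unit length for each gene
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}  # no reads assigned to any gene
    # Scale so that all values sum to one million
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and gene lengths (in base pairs).
counts = {"geneA": 100, "geneB": 300, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 500}

tpm = counts_to_tpm(counts, lengths)
for gene, value in tpm.items():
    print(gene, round(value, 1))
```

In this example geneA and geneB end up with the same TPM even though geneB has three times as many reads, because geneB is also three times as long. Gene lengths can be derived from the GFF file as end minus start plus one.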

RNA-seq and genome data for this tutorial were obtained from the following publication:

https://www.cell.com/cell/pdfExtended/S0092-8674(13)00646-6

Husnik et al...McCutcheon. (2013) "Horizontal Gene Transfer from Diverse Bacteria to an Insect Genome Enables a Tripartite Nested Mealybug Symbiosis" Cell. https://doi.org/10.1016/j.cell.2013.05.040
