zjuwhw / Pipeline

This is a repository to store the pipelines used to analyze microarray, ChIP-seq and RNA-seq data.

Pipeline

This is a repository to store the pipelines used to analyze Affy microarray, single-end ChIP-seq and paired-end RNA-seq data in the Wang lab.

### Single-end ChIP-seq analysis:

#### Required tools:

* alignment: bwa, bowtie2
* peak calling: MACS, hpeak
* motif discovery: meme-chip, homer, amd
* plot: VennDiagram, ggplot2 in R

#### Pipeline:

* chipseq_pipeline.py --- aligns reads from SRA or FASTQ files and outputs a BAM file containing only uniquely mapped, non-duplicated reads
* chipseq_peakcalling.py --- calls peaks from the uniquely mapped, non-duplicated reads. Two tools are available: MACS14 and Hpeak.
* chipseq_peak_overlap_venn.py --- calculates the number of overlapping peaks between different ChIP-seq peak sets and then draws the Venn plot
* chipseq_peak2gene.py --- finds the target genes of a ChIP-enriched region (peak). Three ways of associating peaks with genes are available: "peak2gene", "gene2peak" and "peakAroundgene"
* chipseq_extract_signal_from_bigwig.py --- calculates the signal value over a genomic region, for use in aggregation plots or heatmaps
* chipseq_motif_discovery.py --- motif discovery using meme-chip, homer or amd
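As a rough illustration of the overlap counting that chipseq_peak_overlap_venn.py is described as doing before the Venn plot, the sketch below counts the peaks in one set that overlap at least one peak in another set. It is not taken from the repository: the function name `count_overlapping_peaks`, the `(chrom, start, end)` tuple layout and the half-open (BED-style) coordinates are all assumptions made for this example, and the actual script may work differently.

```python
def count_overlapping_peaks(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are hypothetical (chrom, start, end) tuples with half-open
    coordinates, as in BED files. This layout is an assumption.
    """
    # Group set B by chromosome so only same-chromosome peaks are compared.
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    count = 0
    for chrom, start, end in peaks_a:
        candidates = by_chrom.get(chrom, [])
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end for b_start, b_end in candidates):
            count += 1
    return count


# Toy peak sets, invented for illustration only.
peaks_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
peaks_b = [("chr1", 150, 250), ("chr2", 150, 300), ("chr3", 10, 20)]

shared = count_overlapping_peaks(peaks_a, peaks_b)
print("A overlapping B:", shared)
print("A only:", len(peaks_a) - shared)
```

Counts like these are the kind of numbers a Venn plot needs. The scan compares every peak in one set against all same-chromosome peaks in the other, which is fine for small sets; for genome-scale peak lists a sorted sweep or a tool such as bedtools would be the more usual choice.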

### Microarray analysis:

#### Required tools:

* R Bioconductor package affy for "hgu133a", "hgu133a2", "hgu133b", "hgu133plus2", "hgu219", "hgu95a", "hgu95av2", "hgu95b", "hgu95c", "hgu95d", "hgu95e", "u133aaofav2"
* R Bioconductor package oligo for "huex10st", "hugene10st", "hugene11st", "hugene20st", "hugene21st"

#### Pipeline:

* affy_build_annotation.py --- downloads annotation files for Affy microarray probe IDs, using the R Bioconductor annotation db packages
* affy_ExonOrGene_build_annotation.py --- downloads annotation files for Affy Exon or Gene microarray transcript cluster IDs, using the R Bioconductor annotation db packages
* affy_array_pipeline.py --- Affy microarray pipeline that uses R to perform RMA and/or MAS5.0 normalization, with or without customCDF

### Paired-end RNA-seq analysis:

#### Required tools:

* fastq QC: fastqc
* trimming: trimmomatic
* alignment: STAR, tophat
* bam QC: SAMSTAT, RSeQC, RNA-SeQC, picard-RNASeqMetrics

### Other common tools:

sratoolkit, bedtools, samtools, UCSC Jim Kent utilities, picard


Languages

Python 97.5%, Shell 2.5%