nsteinau / rChIPSeqTools

A package of R scripts for ChIP-Seq data analysis.

rChIPSeqTools

The R scripts in this package were developed to analyse the ChIP-Seq data generated by Stadhouders et al. 2015, Nature Communications. The scripts serve two main functions: (1) peak detection and (2) generation of motif logos from the output of the motif discovery software MEME (http://meme-suite.org/tools/meme).

To run peak detection, execute the script "sRunPeakDetection.R" in the R terminal. The script takes BAM files as input; example files can be found in the "example.data" folder. It automatically reads in the BAM files, detects and splits large regions, and assigns statistical p-values to the detected binding sites. Peak detection is performed on a chromosome-by-chromosome basis.
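As a rough sketch of how this step might be started from an R session: the snippet below lists the BAM files in "example.data" and then runs the script with `source()`. The working directory, the relative paths, and the assumption that the script needs no extra arguments are not confirmed by this README, so treat them as placeholders.

```r
# Assumed locations relative to the cloned repository; adjust as needed.
bam_dir <- "example.data"
script  <- "sRunPeakDetection.R"

# List the BAM inputs that the script is expected to read.
bam_files <- list.files(bam_dir, pattern = "\\.bam$", full.names = TRUE)
print(bam_files)

# Run the script in the current R session only if it is present.
if (file.exists(script)) {
  source(script)
} else {
  message("Script not found; set the working directory to the repository root.")
}
```

If the script defines its input paths internally, those may need to be edited before running it.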

The columns of the peak detection result ("IRF2BP2.chrX.peaks.txt") are as follows:

  1. "chromosome" is the reference chromosome
  2. "start" is the genomic start position of the binding site
  3. "end" is the genomic end position of the binding site
  4. "peak_center" is the genomic position of maximum coverage within the binding site
  5. "chip.Ntags" is the number of mapped ChIP reads in the binding site
  6. "contr.Ntags" is the number of mapped Control reads in the binding site
  7. "chip.max" is the maximum ChIP coverage in the binding site
  8. "control.max" is the maximum Control coverage in the binding site
  9. "padj.Ntags" is the adjusted p-value calculated from chip.Ntags vs. contr.Ntags
  10. "padj.maxCov" is the adjusted p-value calculated from chip.max vs. control.max
  11. "fold.Ntags" is the fold change calculated from chip.Ntags vs. contr.Ntags
  12. "fold.max" is the fold change calculated from chip.max vs. control.max
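To illustrate how these columns might be used downstream, here is a minimal sketch that filters binding sites by adjusted p-value and fold change. The table values are made up for illustration, the cut-offs are hypothetical, and the assumption that the real result file is tab-delimited with a header row is not confirmed by this README.

```r
# Toy table with made-up values, using the column names described above.
peaks <- data.frame(
  chromosome  = c("chrX", "chrX", "chrX"),
  start       = c(1000L, 5000L, 9000L),
  end         = c(1400L, 5300L, 9600L),
  peak_center = c(1200L, 5100L, 9350L),
  chip.Ntags  = c(120, 35, 80),
  contr.Ntags = c(10, 30, 12),
  chip.max    = c(40, 9, 25),
  control.max = c(4, 8, 5),
  padj.Ntags  = c(1e-8, 0.40, 1e-4),
  padj.maxCov = c(1e-6, 0.55, 1e-3),
  fold.Ntags  = c(12, 1.17, 6.67),
  fold.max    = c(10, 1.13, 5),
  stringsAsFactors = FALSE
)

# For the real output, something like the following may work, assuming a
# tab-delimited file with a header row (an assumption, not documented):
# peaks <- read.delim("IRF2BP2.chrX.peaks.txt", stringsAsFactors = FALSE)

# Hypothetical cut-offs; choose values appropriate for the experiment.
padj_cutoff <- 0.05
fold_cutoff <- 2

# Keep sites that pass both adjusted p-values and the read-count fold change.
keep <- peaks$padj.Ntags < padj_cutoff &
  peaks$padj.maxCov < padj_cutoff &
  peaks$fold.Ntags >= fold_cutoff
strong_peaks <- peaks[keep, ]

# Order the retained sites by adjusted p-value, most significant first.
strong_peaks <- strong_peaks[order(strong_peaks$padj.Ntags), ]
print(strong_peaks[, c("chromosome", "start", "end", "padj.Ntags", "fold.Ntags")])
```

Requiring both "padj.Ntags" and "padj.maxCov" to pass is a conservative choice for this sketch; either criterion could also be used on its own.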

To generate motif logos, execute the script "sRunMotifGeneration.R" in the R terminal. The script requires input in MEME format (see the example in "example.data/Demo.meme.result.txt"). It automatically generates motif logos for both the forward and the reverse-complement strands and saves them in PDF format (see "example.data/Demo.meme.result.txt.logo.pdf" for an example output).
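The relationship between the forward and reverse-complement logos can be sketched with a small position probability matrix. This is a generic base-R illustration and not necessarily how "sRunMotifGeneration.R" implements it; the matrix values and the function name `reverse_complement_pwm` are made up for this example.

```r
# Toy position probability matrix (made-up values): rows are A, C, G, T,
# columns are motif positions, and each column sums to 1.
pwm <- matrix(
  c(0.7, 0.1, 0.1, 0.1,   # position 1: mostly A
    0.1, 0.1, 0.7, 0.1,   # position 2: mostly G
    0.1, 0.6, 0.2, 0.1),  # position 3: mostly C
  nrow = 4,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Hypothetical helper: reverse the position order and swap A<->T and C<->G.
# With rows ordered A, C, G, T, swapping complements equals reversing the rows.
reverse_complement_pwm <- function(m) {
  stopifnot(identical(rownames(m), c("A", "C", "G", "T")))
  rc <- m[4:1, ncol(m):1, drop = FALSE]
  rownames(rc) <- rownames(m)
  rc
}

rc_pwm <- reverse_complement_pwm(pwm)
print(rc_pwm)

# Most probable base per position: forward reads A, G, C; the
# reverse complement reads G, C, T.
fwd_consensus <- rownames(pwm)[apply(pwm, 2, which.max)]
rc_consensus  <- rownames(rc_pwm)[apply(rc_pwm, 2, which.max)]
print(paste(fwd_consensus, collapse = ""))
print(paste(rc_consensus, collapse = ""))
```

Either matrix can then be passed to a logo-drawing function of choice to produce the two strand orientations.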

Please contact supat.thongjuea@gmail.com
