Quality control and differential expression for bcbio RNA-seq experiments.

Home Page: http://bioinformatics.sph.harvard.edu/bcbioRNASeq

bcbioRNASeq

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Installation

This is an R package.

source("https://bioconductor.org/biocLite.R")
biocLite("devtools")
biocLite("GenomeInfoDbData")
biocLite(
    "hbc/bcbioRNASeq",
    dependencies = c("Depends", "Imports", "Suggests")
)

conda method

conda install -c bioconda r-bcbiornaseq

Load bcbio run

library(bcbioRNASeq)
bcb <- loadRNASeq(
    uploadDir = "bcbio_rnaseq_run/final",
    interestingGroups = c("genotype", "treatment"),
    organism = "Homo sapiens"
)
# Back up all data inside bcbioRNASeq object
flatFiles <- flatFiles(bcb)
saveData(bcb, flatFiles)

This will return a bcbioRNASeq object, which is an extension of the Bioconductor RangedSummarizedExperiment container class.

Parameters:

  • uploadDir: Path to the bcbio final upload directory.
  • interestingGroups: Character vector of the column names of interest in the sample metadata, which is stored in the colData() accessor slot of the bcbioRNASeq object. These values should be formatted in camelCase, and can be reassigned in the object after creation (e.g. interestingGroups(bcb) <- c("batch", "age")). They are used for data visualization in the quality control utility functions.
  • organism: Organism name. Use the full Latin name (e.g. "Homo sapiens").

Consult help("loadRNASeq", "bcbioRNASeq") for additional documentation.

R Markdown templates

This package provides multiple R Markdown templates, including Quality Control and Differential Expression using DESeq2, which are available in RStudio at File -> New File -> R Markdown... -> From Template.

Examples

View example HTML reports rendered from the default R Markdown templates included in the package:

Sample metadata

For a normal bcbio RNA-seq run, the sample metadata is imported automatically from the project-summary.yaml file in the final upload directory. If you notice any typos in your metadata after completing the run, you can correct them in the YAML file. Alternatively, you can pass a sample metadata file to loadRNASeq() using the sampleMetadataFile parameter.
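As a rough sketch of the second option, the call below passes an external metadata file through the sampleMetadataFile parameter. The file name sample_metadata.csv and the use of CSV as the file format are assumptions made for illustration; the other arguments are taken from the loading example above. The call is guarded so that nothing runs unless both paths exist.

```r
# Hypothetical paths; replace them with your own.
uploadDir <- "bcbio_rnaseq_run/final"
sampleMetadataFile <- "sample_metadata.csv"  # assumed file name and format

if (dir.exists(uploadDir) && file.exists(sampleMetadataFile)) {
    library(bcbioRNASeq)
    # Use the external metadata file in place of the metadata
    # stored in project-summary.yaml.
    bcb <- loadRNASeq(
        uploadDir = uploadDir,
        sampleMetadataFile = sampleMetadataFile,
        interestingGroups = "genotype",
        organism = "Homo sapiens"
    )
} else {
    message("Upload directory or metadata file not found; nothing was loaded.")
}
```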

Minimal example

The sample IDs in the bcbioRNASeq object map to the description column, which gets sanitized internally into a sampleID column. The sample names provided in the description column must be unique.

fileName              description  genotype
sample_1_R1.fastq.gz  sample_1     wildtype
sample_2_R1.fastq.gz  sample_2     knockout
sample_3_R1.fastq.gz  sample_3     wildtype
sample_4_R1.fastq.gz  sample_4     knockout
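One way to prepare such a table is to build it in base R, confirm that the description values are unique, and write it to disk. This is only a sketch: the output file name is hypothetical, and CSV is assumed to be an acceptable format for the sampleMetadataFile parameter.

```r
# Recreate the minimal example table shown above.
metadata <- data.frame(
    fileName = paste0("sample_", 1:4, "_R1.fastq.gz"),
    description = paste0("sample_", 1:4),
    genotype = c("wildtype", "knockout", "wildtype", "knockout"),
    stringsAsFactors = FALSE
)

# The description column must contain unique sample names.
stopifnot(anyDuplicated(metadata$description) == 0)

# Write the table to a temporary location (hypothetical file name).
metadataFile <- file.path(tempdir(), "sample_metadata.csv")
write.csv(metadata, metadataFile, row.names = FALSE)
```

The resulting path could then be supplied to loadRNASeq() through sampleMetadataFile.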

Technical replicates

Use sampleNameAggregate to assign groupings for technical replicates:

fileName                   description    sampleNameAggregate
wildtype_L001_R1.fastq.gz  wildtype_L001  wildtype
wildtype_L002_R1.fastq.gz  wildtype_L002  wildtype
wildtype_L003_R1.fastq.gz  wildtype_L003  wildtype
wildtype_L004_R1.fastq.gz  wildtype_L004  wildtype
mutant_L001_R1.fastq.gz    mutant_L001    mutant
mutant_L002_R1.fastq.gz    mutant_L002    mutant
mutant_L003_R1.fastq.gz    mutant_L003    mutant
mutant_L004_R1.fastq.gz    mutant_L004    mutant
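To make the grouping concrete, the base R sketch below rebuilds the table above and lists which lane-level descriptions share each sampleNameAggregate value. It only illustrates how the column partitions the samples; it is not the package's own aggregation code.

```r
# Rebuild the technical replicate table shown above.
lanes <- sprintf("L%03d", 1:4)
metadata <- data.frame(
    description = c(paste0("wildtype_", lanes), paste0("mutant_", lanes)),
    sampleNameAggregate = rep(c("wildtype", "mutant"), each = 4),
    stringsAsFactors = FALSE
)
metadata$fileName <- paste0(metadata$description, "_R1.fastq.gz")

# Each description must still be unique, even though the aggregate names repeat.
stopifnot(anyDuplicated(metadata$description) == 0)

# Group the lane-level descriptions by their aggregate sample name.
replicateGroups <- split(metadata$description, metadata$sampleNameAggregate)
print(replicateGroups)
```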

Citation

citation("bcbioRNASeq")

Steinbaugh MJ, Pantano L, Kirchner RD, Barrera V, Chapman BA, Piper ME, Mistry M, Khetani RS, Rutherford KD, Hoffman O, Hutchinson JN, Ho Sui SJ. (2017). bcbioRNASeq: R package for bcbio RNA-seq analysis. F1000Research 6:1976.

References

The papers and software cited in our workflows are available as a shared library on Paperpile.


License: MIT License

