theJasonFan / shoal

Improved multi-sample transcript abundance estimates using adaptive priors

shoal

Image: a shoal [1]

What is shoal?

shoal is a tool that jointly quantifies transcript abundances across multiple samples. Specifically, shoal learns an empirical prior on transcript-level abundances across all of the samples in an experiment, and then uses a variant of the variational Bayesian expectation maximization algorithm to apply this prior adaptively across multi-mapping groups of reads.
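
To make the idea concrete, here is a minimal sketch of a variational Bayesian EM update over equivalence classes (groups of reads that map to the same set of transcripts) with a per-transcript prior. It is not shoal's implementation: the function names, the data layout, and the use of one fixed prior per transcript are assumptions made for illustration, and shoal's adaptive weighting of the prior across multi-mapping groups is not reproduced here.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0."""
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic expansion: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    return result + math.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))


def vbem(eq_classes, prior, n_iter=100):
    """Toy VBEM over equivalence classes (hypothetical data layout).

    eq_classes: list of (transcript_ids, weights, read_count) tuples.
    prior: one strictly positive pseudo-count per transcript.
    Returns the expected number of reads assigned to each transcript.
    """
    alpha = [p + 1.0 for p in prior]
    expected = [0.0] * len(prior)
    for _ in range(n_iter):
        # The shared normalizer exp(digamma(sum(alpha))) cancels within each class.
        exp_psi = [math.exp(digamma(a)) for a in alpha]
        expected = [0.0] * len(prior)
        for tids, weights, count in eq_classes:
            scores = [w * exp_psi[t] for t, w in zip(tids, weights)]
            total = sum(scores)
            if total <= 0.0:
                continue
            # Split the class's reads in proportion to the current scores.
            for t, s in zip(tids, scores):
                expected[t] += count * s / total
        # Posterior parameters: prior pseudo-counts plus expected read counts.
        alpha = [p + e for p, e in zip(prior, expected)]
    return expected
```

For instance, with a single ambiguous class `[([0, 1], [1.0, 1.0], 100)]`, a uniform prior `[1.0, 1.0]` splits the reads evenly between the two transcripts, while a skewed prior such as `[10.0, 0.01]` shifts most of them to the first transcript.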

shoal can increase quantification accuracy and inter-sample consistency, and reduce false positives in downstream differential analysis, when applied to multi-condition RNA-seq experiments. Moreover, shoal runs downstream of Salmon and requires less than a minute per sample to re-estimate transcript abundances while accounting for the learned empirical prior.

shoal is designed and developed by Avi Srivastava, Michael Love and Rob Patro.

Using shoal

shoal requires Salmon output for every sample in the experiment, with each sample quantified separately using the latest version of Salmon (either built from the develop branch of the Salmon repo, or a pre-compiled binary for Linux from here). Please run Salmon with the --dumpEqWeights option, which will produce output suitable for shoal.
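
As a rough sketch of the per-sample Salmon step, the helper below assembles one `salmon quant` command per sample with `--dumpEqWeights` enabled. The index path, the FASTQ file naming (`<sample>_1.fastq.gz`, `<sample>_2.fastq.gz`), and the function names are hypothetical; `-l A` asks Salmon to infer the library type automatically. Adjust the arguments to match your own data.

```python
import os
import subprocess


def salmon_command(sample, index_dir, reads_dir, quant_root):
    """Build a `salmon quant` command for one paired-end sample (assumed file naming)."""
    return [
        "salmon", "quant",
        "-i", index_dir,
        "-l", "A",
        "-1", os.path.join(reads_dir, sample + "_1.fastq.gz"),
        "-2", os.path.join(reads_dir, sample + "_2.fastq.gz"),
        "-o", os.path.join(quant_root, sample),  # one subdirectory per sample
        "--dumpEqWeights",                       # required for shoal
    ]


def run_all(samples, index_dir, reads_dir, quant_root, dry_run=True):
    """Print the commands (dry run) or execute them one sample at a time."""
    commands = [salmon_command(s, index_dir, reads_dir, quant_root) for s in samples]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Writing each sample to its own subdirectory of a single root keeps the results in the shape that the shoal script expects for its -q option.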

  • clone shoal to your local machine:
git clone https://github.com/COMBINE-lab/shoal.git
  • run shoal [2]:
./run_shoal.sh -q <salmon_quant_directory_path> -o <output_directory_path>

This script assumes that all of the Salmon quantification directories are subdirectories of the path that you provide via the -q option. For example, if you have an experiment with six samples across two conditions (say, A{1,2,3} and B{1,2,3}), the shoal script expects a layout like:

exp_quants
  |
  |--- A1
    |
    |--- quant.sf
  |--- A2
    |
    |--- quant.sf
  |--- A3
    |
    |--- quant.sf
  |--- B1
    |
    |--- quant.sf
  |--- B2
    |
    |--- quant.sf
  |--- B3
    |
    |--- quant.sf

The script is then invoked by passing -q exp_quants to provide the top-level quantification directory for the entire experiment. Specifically, a command like ./run_shoal.sh -q exp_quants -o exp_shoal_quants would produce a modified (Salmon-format) quantification file for each of the samples ({A,B}{1,2,3}) in the directory exp_shoal_quants, as described below (the script will create the output directory if it does not already exist).
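
Before running the script, it can help to confirm that the quantification directory actually has this shape. The following sketch (the function name is an assumption, not part of shoal) lists the sample subdirectories that contain a quant.sf file and reports any that do not.

```python
import os


def find_samples(quant_root):
    """Return (samples_with_quant_sf, subdirs_missing_quant_sf), both sorted by name."""
    ready, missing = [], []
    for name in sorted(os.listdir(quant_root)):
        sample_dir = os.path.join(quant_root, name)
        if not os.path.isdir(sample_dir):
            continue  # ignore stray files at the top level
        if os.path.isfile(os.path.join(sample_dir, "quant.sf")):
            ready.append(name)
        else:
            missing.append(name)
    return ready, missing
```

Calling `find_samples("exp_quants")` on the layout above should return the six sample names and an empty list of missing directories.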

  • shoal output:
    -- shoal generates one .sf file for each sample in the experiment, named as follows:
<output_directory>/<sample_name>_adapt.sf
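
Because the output is described as Salmon-format, it can presumably be read like any quant.sf file. The sketch below assumes that the usual tab-separated Salmon columns (Name, Length, EffectiveLength, TPM, NumReads) are preserved in the _adapt.sf files, which is not verified here; the function name is hypothetical.

```python
import csv
import os


def load_adapt_quants(output_dir, sample):
    """Read <output_dir>/<sample>_adapt.sf into {transcript: (TPM, NumReads)}.

    Assumes the standard Salmon quant.sf header is present in the file.
    """
    path = os.path.join(output_dir, sample + "_adapt.sf")
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {
            row["Name"]: (float(row["TPM"]), float(row["NumReads"]))
            for row in reader
        }
```

Loading the original quant.sf and the corresponding _adapt.sf for the same sample in this way makes it straightforward to compare the two sets of estimates transcript by transcript.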

Compilation Error

readlink: illegal option -- f
usage: readlink [-n] [file ...]

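This error appears when the system readlink does not support the -f option, as on macOS. Install coreutils, which provides the greadlink command: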

brew install coreutils

Footnotes:

[1] This image is from the Wikipedia article on shoaling. It is licensed under CC-BY-SA.

[2] The shell script can be made executable with the command: chmod +x run_shoal.sh

License: BSD 2-Clause "Simplified" License

