jfallmann / galaxy-rna-workbench

Galaxy RNA workbench (de.NBI project)

RNA Galaxy Workbench

The RNA Galaxy workbench is a comprehensive set of analysis tools and consolidated workflows. The workbench is based on the Galaxy framework, which provides simple access, easy extension, flexible adaptation to personal and security needs, and sophisticated analyses independent of command-line knowledge.

The current implementation comprises more than 50 bioinformatics tools dedicated to different research areas of RNA biology, including RNA structure analysis, RNA alignment, RNA annotation, RNA-protein interaction, ribosome profiling, RNA-Seq analysis, and RNA target prediction.

The workbench is developed by the RNA Bioinformatics Center (RBC). This center is one of the eight service units of the German Network for Bioinformatics Infrastructure (de.NBI), which runs the German ELIXIR Node.

Usage

The RNA workbench implements a web server based on the Galaxy Docker platform: a dedicated Galaxy instance wrapped in a Docker container. For advanced local deployments, we recommend checking out the upstream documentation.

▲ back to top

Requirement

To use the Galaxy RNA workbench, you only need Docker, which can be installed in different ways, depending on the type of system you're running:

  • Non-Linux users are encouraged to use Kitematic, which provides a Docker installation for OSX or Windows, coupled with a user-friendly interface to run Docker containers;
  • Linux users and people familiar with the command line can follow the instructions for installing Docker on its website.
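
Before launching the workbench, it can help to confirm that Docker is actually usable from the command line. The following is a minimal sketch, not part of the workbench itself: the helper names have_cmd and docker_status are hypothetical, and the check relies only on docker info, which fails when the Docker daemon cannot be reached.

```shell
# Succeed if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the Docker client is installed and the daemon answers.
# Prints one of: missing, installed-not-running, ready
docker_status() {
    if ! have_cmd docker; then
        echo "missing"
    elif ! docker info >/dev/null 2>&1; then
        echo "installed-not-running"
    else
        echo "ready"
    fi
}

docker_status
```

If the result is installed-not-running, starting the Docker service (or the Docker desktop application) is usually the missing step.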

▲ back to top

Docker configuration

The RNA workbench Docker container is rather large and is expected to grow as further tools and workflows are contributed. For users new to Docker, we list here some tweaks that help to work around issues that can come up when first using Docker. After a successful installation of Docker, we recommend configuring some settings, for example those dealing with the storage space required by containers. You can find more information here.
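
Since storage space is the most likely first obstacle, a quick check of the free space where Docker keeps its data can save a failed download. This is a rough sketch under two assumptions that are not taken from the workbench documentation: /var/lib/docker is used because it is Docker's default data directory on Linux, and the 20 GB threshold is an arbitrary illustrative value. The helper names free_kb and has_free_gb are hypothetical.

```shell
# Print the free space, in kilobytes, of the filesystem holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 gigabytes free.
has_free_gb() {
    [ "$(free_kb "$1")" -ge $(( $2 * 1024 * 1024 )) ]
}

# Assumption: Docker's default data directory on Linux; fall back to / if absent.
DOCKER_DIR="${DOCKER_DIR:-/var/lib/docker}"
[ -d "$DOCKER_DIR" ] || DOCKER_DIR=/

# Assumption: 20 GB is an illustrative threshold, not an official requirement.
if has_free_gb "$DOCKER_DIR" 20; then
    echo "enough free space under $DOCKER_DIR"
else
    echo "less than 20 GB free under $DOCKER_DIR"
fi
```

Once images have been downloaded, docker system df summarizes how much space images, containers and volumes occupy.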

▲ back to top

RNA workbench launch

The procedure to launch the RNA workbench depends on whether you run Docker images using Kitematic or the command line interface.

Using Kitematic

Kitematic users can launch the RNA workbench directly from its interface. The following video shows how to load the docker container that is necessary to use the workbench:

Galaxy RNA workbench launch through Kitematic

▲ back to top

Without Kitematic

For non-Kitematic users, starting the RNA workbench is analogous to starting the generic Galaxy Docker image:

$ docker run -d -p 8080:80 quay.io/bgruening/galaxy-rna-workbench

A detailed discussion of Docker's parameters is given in the Docker manual. It is really worth reading. Nevertheless, here is a quick rundown:

  • docker run starts the image as a container.

    If the image is not already stored locally, Docker downloads it automatically.

  • The argument -p 8080:80 makes port 80 (inside the container) available on port 8080 on your host.

    Inside the container, an Apache web server is running on port 80, and that port can be bound to a local port on your host computer. With this parameter you can access your Galaxy instance via http://localhost:8080 immediately after executing the command above.

  • quay.io/bgruening/galaxy-rna-workbench is the image name, which directs Docker to the correct path in the Docker registry.

  • -d starts the Docker container in detached (daemon) mode.

    For an interactive session, execute:

    $ docker run -i -t -p 8080:80 quay.io/bgruening/galaxy-rna-workbench /bin/bash

    and manually invoke the startup script to start PostgreSQL, Apache and Galaxy.
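
To illustrate the port mapping described above, here is a small sketch for the case where host port 8080 is already taken. It only prints the command and the matching URL rather than starting anything. The helper names run_cmd and galaxy_url and the default host port 8081 are hypothetical choices for illustration; the image name and container port 80 come from the commands above.

```shell
# Image name as used in the commands above.
IMAGE="quay.io/bgruening/galaxy-rna-workbench"

# Print the docker run command that publishes container port 80 on host port $1.
run_cmd() {
    printf 'docker run -d -p %s:80 %s\n' "$1" "$IMAGE"
}

# Print the URL under which Galaxy is reachable for host port $1.
galaxy_url() {
    printf 'http://localhost:%s\n' "$1"
}

# Assumption: 8081 is just an example of a free host port.
HOST_PORT="${HOST_PORT:-8081}"
run_cmd "$HOST_PORT"       # prints the command only; pipe the output to sh to run it
galaxy_url "$HOST_PORT"
```

For a container started in detached mode, docker ps lists the running container, docker logs followed by the container ID shows its startup output, and docker stop followed by the container ID shuts it down.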

Docker images are "read-only". All changes made during one session are lost when you start a new container from the image. This mode is useful to present Galaxy to your colleagues or to run workshops with it.

To install Tool Shed repositories or to save your data, you need to export the calculated data to the host computer. Fortunately, this is as easy as:

$ docker run -d -p 8080:80 -v /home/user/galaxy_storage/:/export/ quay.io/bgruening/galaxy-rna-workbench

Given the additional -v /home/user/galaxy_storage/:/export/ parameter, Docker mounts the folder /home/user/galaxy_storage into the container under /export/. The startup.sh script, which usually starts Apache, PostgreSQL and Galaxy, recognizes the export directory, with one of the following outcomes:

  • If /export/ is empty, it moves the PostgreSQL database, the Galaxy database directory, Shed Tools, Tool Dependencies and various configuration scripts to /export/ and symlinks them back to their original locations.
  • If /export/ is not empty, for example because you continue a previous session within the same folder, nothing is moved, but the symlinks are created.

This enables you to have different export folders for different sessions - meaning real separation of your different projects.
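
As a sketch of this per-project separation, the following prints one docker run command per project, each with its own export folder, host port and container name. Nothing is started: the commands are only printed. The helper name project_cmd, the base directory in STORAGE_BASE, the project names and the rna- prefix for container names are all hypothetical; --name, -p and -v are standard docker run options.

```shell
# Image name as used in the commands above.
IMAGE="quay.io/bgruening/galaxy-rna-workbench"

# Assumption: all export folders live under this hypothetical base directory.
STORAGE_BASE="${STORAGE_BASE:-$HOME/galaxy_storage}"

# Print the docker run command for project $1 on host port $2.
# Each project gets its own export folder and its own container name.
project_cmd() {
    printf 'docker run -d --name rna-%s -p %s:80 -v %s/%s/:/export/ %s\n' \
        "$1" "$2" "$STORAGE_BASE" "$1" "$IMAGE"
}

# Two example projects on different host ports.
project_cmd projectA 8080
project_cmd projectB 8081
```

Restarting a project later with the same export folder picks up the previous session, as described in the second outcome above.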

The command above starts the Galaxy RNA workbench: it configures and launches a Galaxy instance and populates it with the needed tools. The instance will be accessible at http://localhost:8080.

For a more specific configuration, you can have a look at the documentation of the Galaxy Docker Image.

▲ back to top

Users and passwords

The Galaxy admin user has the username admin@galaxy.org and the password admin. To use certain features of Galaxy, such as the RNA structure visualization, you have to be logged in. Installing additional tools also requires a login.

The PostgreSQL username is galaxy, the password is galaxy, and the database name is galaxy.

If you want to create new users, please make sure to use the /export/ volume. Otherwise the new users will be lost when your Docker session ends.
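
Before creating users, one way to confirm that a running container really has an /export/ volume is to look at its mount destinations. This is a minimal sketch: the helper names has_export_mount and container_mounts are hypothetical, and the container ID must be supplied by you (for example from docker ps). The docker inspect call uses a Go template over the container's Mounts list.

```shell
# Read mount destinations (one per line) on stdin;
# succeed if /export (with or without a trailing slash) is among them.
has_export_mount() {
    grep -qx '/export/\{0,1\}'
}

# Print the mount destinations of container $1 (requires a running Docker daemon).
container_mounts() {
    docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$1"
}

# Usage, with a container ID taken from "docker ps":
#   container_mounts <container-id> | has_export_mount \
#       || echo "no /export volume: new users will not persist"
```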

▲ back to top

Tours

The RNA workbench provides the possibility to run interactive tours that illustrate how the main interface works in relation to real-life user tasks. These show many common operations, such as searching, parametrizing, and running tools, or saving a history of operations in a sharable workflow.

The following video demonstrates the main elements that compose the Galaxy user interface:

Galaxy RNA workbench UI tour

▲ back to top

Available tools

In this section we list all tools that have been integrated in the RNA workbench. The list is likely to grow as further tools and workflows are contributed. To ease readability, we divided them into categories.

▲ back to top

RNA structure prediction and analysis

| Tool | Description | Reference |
| --- | --- | --- |
| antaRNA | Inverse RNA structure folding with an optional GC value constraint | Kleinkauf et al. 2015 |
| CoFold | A thermodynamics-based RNA secondary structure folding algorithm | Proctor and Meyer, 2015 |
| Kinwalker | Algorithm for cotranscriptional folding of RNAs to obtain the minimum free energy structure | Geis et al. 2008 |
| MEA | Prediction of maximum expected accuracy RNA secondary structures | Amman et al. 2013 |
| RNAshapes | Maps structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features | Janssen and Giegerich, 2014 |
| RNAz | Predicts structurally conserved and thermodynamically stable RNA secondary structures in multiple sequence alignments | Washietl et al. 2005 |
| segmentation-fold | An application that predicts RNA 2D-structure with an extended version of the Zuker algorithm | - |
| ViennaRNA | A tool compilation for prediction and comparison of RNA secondary structures | Lorenz et al. 2011 |

▲ back to top

RNA alignment

| Tool | Description | Reference |
| --- | --- | --- |
| Compalignp | An RNA counterpart of the protein specific "Benchmark Alignment Database" | Wilm et al. 2006 |
| LocARNA | A tool for multiple alignment of RNA molecules | Will et al. 2012 |
| MAFFT | A multiple sequence alignment program for unix-like operating systems | Katoh and Standley, 2016 |
| RNAlien | A tool for RNA family model construction | Eggenhofer et al. 2016 |
| CMV | RNA family model visualisation | Eggenhofer et al. 2018 |

▲ back to top

RNA annotation

| Tool | Description | Reference |
| --- | --- | --- |
| ARAGORN | A tool to identify tRNA and tmRNA genes | Laslett and Canback, 2004 |
| Fusion Matcher (FuMa) | A tool that reports identical fusion genes based on gene-name annotations | Hoogstrate et al. 2016 |
| GotohScan | A search tool that finds shorter sequences in large database sequences | Hertel et al. 2009 |
| INFERNAL | A tool searching DNA sequence databases for RNA structure and sequence similarities | Nawrocki et al. 2015 |
| RNABOB | A tool for fast pattern searching for RNA secondary structures | - |
| RNAcode | Predicts protein coding regions in a set of homologous nucleotide sequences | Washietl et al. 2011 |
| RNAmmer | Predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences | Lagesen et al. 2007 |
| tRNAscan | Searches for tRNA genes in genomic sequences | Lowe and Eddy, 1997 |
| RCAS | A generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments | Uyar et al. |

▲ back to top

RNA-protein interaction

| Tool | Description | Reference |
| --- | --- | --- |
| AREsite2 | A database for AU-/GU-/U-rich elements in human and model organisms | Fallmann et al. 2016 |
| DoRiNA | A database of RNA interactions in post-transcriptional regulation | Blin et al. 2014 |
| PARalyzer | An algorithm to generate a map of interacting RNA-binding proteins and their targets | Corcoran et al. 2011 |
| Piranha | A peak-caller for CLIP- and RIP-seq data | - |

▲ back to top

RNA target prediction

| Tool | Description | Reference |
| --- | --- | --- |
| TargetFinder | A tool to predict small RNA binding sites on target transcripts from a sequence database | - |

▲ back to top

RNA Seq and HTS analysis

Preprocessing

| Tool | Description | Reference |
| --- | --- | --- |
| FastQC! | A quality control tool for high throughput sequence data | - |
| Trim Galore! | Automatic quality and adapter trimming as well as quality control | - |

▲ back to top

RNA-Seq

| Tool | Description | Reference |
| --- | --- | --- |
| BlockClust | Small non-coding RNA clustering from deep sequencing read profiles | Videm et al. 2014 |
| FlaiMapper | A tool for computational annotation of small ncRNA-derived fragments using RNA-seq data | Hoogstrate et al. 2015 |
| MiRDeep2 | Discovers microRNA genes by analyzing sequenced RNAs | Friedländer et al. 2008 |
| NASTIseq | A method that incorporates the inherent variable efficiency of generating perfectly strand-specific libraries | Li et al. 2013 |
| PIPmiR | An algorithm to identify novel plant miRNA genes from a combination of deep sequencing data and genomic features | Breakfield et al. 2011 |
| SortMeRNA | A tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and -genomic data | Kopylova et al. 2011 |

▲ back to top

Read Mapping

| Tool | Description | Reference |
| --- | --- | --- |
| HISAT2 | Hierarchical indexing for spliced alignment of transcripts | Pertea et al. 2016 |
| STAR | Rapid spliced aligner for RNA-seq data | Dobin et al. 2013 |
| STAR-fusion | Fast fusion gene finder | Haas et al. 2017 |
| Bowtie 2 | Fast and sensitive read alignment | Langmead et al. 2012 |
| BWA | Software package for mapping low-divergent sequences against a large reference genome | Li and Durbin 2009, Li and Durbin 2010 |

▲ back to top

Transcript Assembly

| Tool | Description | Reference |
| --- | --- | --- |
| Trinity | De novo transcript sequence reconstruction from RNA-Seq | Haas et al. 2013 |

▲ back to top

Quantification

| Tool | Description | Reference |
| --- | --- | --- |
| featureCounts | Ultrafast and accurate read summarization program | Liao et al. 2014 |
| htseq-count | Tool for counting reads in features | Anders et al. 2015 |
| Sailfish | Rapid Alignment-free Quantification of Isoform Abundance | Patro et al. 2014 |
| Salmon | Fast, accurate and bias-aware transcript quantification | Patro et al. 2017 |

▲ back to top

Differential expression analysis

| Tool | Description | Reference |
| --- | --- | --- |
| DESeq2 | Differential gene expression analysis based on the negative binomial distribution | Love et al. 2014 |

▲ back to top

Utilities

| Tool | Description | Reference |
| --- | --- | --- |
| SAMtools | Utilities for manipulating alignments in the SAM format | Li et al. 2009 |
| BEDTools | Utilities for genome arithmetic | Quinlan and Hall 2010 |
| deepTools | Tools for exploring deep-sequencing data | Ramirez et al. 2014, Ramirez et al. 2016 |

▲ back to top

Ribosome profiling

| Tool | Description | Reference |
| --- | --- | --- |
| RiboTaper | An analysis pipeline for Ribo-Seq experiments, exploiting the triplet periodicity of ribosomal footprints to call translated regions | Calviello et al. 2016 |

▲ back to top

Training

To learn about RNA sequencing data analysis, we recommend having a look at the training material from the Galaxy Training Network, and particularly the tutorial on reference-based RNA-seq data analysis.

In the Galaxy RNA workbench, we also included Galaxy interactive tours to guide you through Galaxy, its tools and possibilities.

▲ back to top

How to contribute

The RNA-workbench community welcomes new contributions and help in any way. We have collected detailed instructions and some guidance in our CONTRIBUTING.md.

Support and bug reports

For support, questions, feature requests, or bug reports, please open an issue on our issue page.

▲ back to top

MIT license

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

▲ back to top
