royfrancis / witc-rnaseq

Workshop in the cloud

How to set up a virtual workshop in the cloud

Prerequisites

Docker

Download and install Docker Community Edition from the Docker website.
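
Once installed, you can verify that Docker works (a quick sanity check; the hello-world image is pulled from Docker Hub):

docker --version
docker run --rm hello-world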

Python prerequisites

You need Python 3 installed.
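
You can check which version you have (a quick check; this README does not pin an exact minimum version):

python3 --version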

I recommend using virtualenv to install the Python dependencies (though this is optional):

virtualenv venv
source venv/bin/activate
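
If you prefer not to install virtualenv, the venv module that ships with Python 3 can be used the same way (a minimal alternative sketch):

python3 -m venv venv
source venv/bin/activate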

To install the Python libraries, run:

pip install -r requirements.txt

Set the SNIC Science Cloud API Password

Go to the cloud portal; in the left-hand menu there is a link called Set your API password. Follow it and set a password.

Get credentials file from the openstack instance

Go to the web interface of the OpenStack project, open Compute -> Access & Security in the sidebar, click the API Access tab, and then Download OpenStack RC File v3. Save that file as credentials.sh or similar.

Then source this file and enter the API password when prompted:

source credentials.sh
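
To confirm that the credentials were loaded into your shell environment, you can list the OpenStack variables the RC file sets (a quick check; the exact set of variables depends on the RC file version):

env | grep '^OS_'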

Setup

To create the configuration for the system, run create_course.py. This creates the configuration files for the cloud and also generates a file, passwords.txt, with usernames and passwords for the students.

For example, it can be run like this:

./create_course.py --users 3 --course-name biostatistics --cluster-prefix workshop --student-flavor ssc.small --master-disk-size 100 --shared-dir /data

This creates a course instance and spins up three student virtual machines. An NFS server with a capacity of 100 GB is also set up on the master node and mounted on the student machines under /data. You can find information about virtual machine flavors on the SNIC website.
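
After the script finishes, you can inspect what was generated (a sketch assuming the course name biostatistics from the example above; the file contents will differ per run):

ls biostatistics/
cat biostatistics/passwords.txt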

These are all the configuration settings (can also be viewed with the --help switch):

usage: create_course.py [-h] --users USERS --course-name NAME [--cluster-prefix <virt-workshop>]
						[--master-flavor <ssc.small>] [--master-disk-size <0>]
						[--student-flavor <ssc.small>]
						[--student-disk-size <10>] [--shared-dir <shared-dir>]        

optional arguments:
  -h, --help            show this help message and exit
  --cluster-prefix <virt-workshop>
						Cluster prefix for hostnames in openstack, default is
						virt-workshop
  --master-flavor <ssc.small>
						The openstack flavor for the master node, default is
						ssc.small
  --master-disk-size <0>
						The disk size for the extra disk of the master node,
						in Gb, default is 0
  --student-flavor <ssc.small>
						The openstack flavor for the student nodes, default is
						ssc.small
  --student-disk-size <10>
						The disk size for the extra disk of the student nodes,
						in Gb, default is 10
  --shared-dir <shared-dir>
						Directory that should be shared from the master node
						to the compute nodes. For example: "--shared-dir /data"

Installing scientific software with Conda

Under <course-name>/playbooks there is a file called environment.yml where you can define which Conda packages to install into the Conda environment. See the example below:

channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - star=2.5.4a
  - bwa=0.7.15

You can find the available Bioconda packages in the Bioconda archive. For more information on how to manage Conda environments, consult the Conda documentation.
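
To check which versions of a package are available before pinning one in environment.yml, you can search the bioconda channel (assuming conda is installed on your local machine):

conda search -c bioconda star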

Launch the system

The course configuration is located in a new folder named after the course. Navigate to it and launch the virtual infrastructure.

NOTE: Before launching the system, you might need to add your private key to the SSH agent.

cd <course-name>
chmod 400 ssh_key && ssh-add ssh_key
./kn apply

TIP: To tear down the virtual infrastructure, you can use:

./kn destroy

Data upload

To make your data available in the system, use the upload_data.py script. This uploads your local data to the NFS server.

./upload_data.py --local-dir ./dir1 --local-dir /opt/dir2 --remote-dir /data

These are all the configuration settings (can also be viewed with the --help switch):

usage: upload_data.py [-h] --local-dir DIRS --remote-dir DIR

arguments:
  -h, --help            show this help message and exit
  --local-dir <local-dir>
					Local directory to upload to NFS. Can be repeated. For example: "--local-dir ./dir1 --local-dir /opt/dir2"
  --remote-dir <remote-dir>
					Remote directory. For example: "--remote-dir /data"

Student login

The credentials for accessing the virtual machines can be found in the passwords.txt file. Students should be able to access their instances via ssh using their username and password. The master node IP address can be found in the inventory file in the course folder.

ssh <student-name>@<master-node-ip>
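
To look up the master node IP, you can print the inventory file (a sketch; the exact layout of the generated inventory is not specified here):

cat <course-name>/inventory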

Administrator login

Teachers should be able to access the master node via ssh with private key authentication.

ssh ubuntu@<master-node-ip>
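
Once logged in, one way to confirm that the shared NFS directory is in place is to check the mount (a sketch assuming the /data shared dir from the setup example above):

df -h /data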
