Slide-seq tools
Tools for analyzing Slide-seq data, including building a genome reference, aligning reads to the reference genome, generating feature-barcode matrices, performing gene expression analysis, and matching data from in situ sequencing and indexing of barcodes with short-read sequencing data.
Requirement
Several public tools need to be pre-installed:
Drop-seq tools: https://github.com/broadinstitute/Drop-seq
Picard: https://broadinstitute.github.io/picard/
STAR: https://github.com/alexdobin/STAR
Java
Samtools
gcc/g++
Python (3.6 or above preferred)
Several Python packages need to be installed for calculation and plotting:
numpy
pandas
plotnine
matplotlib
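Before running anything, it can help to confirm that the prerequisites listed above are visible from your environment. The following is a minimal sketch, not part of the Slide-seq tools: the executable names (java, samtools, g++, STAR) are assumptions about how the tools are exposed on your PATH, and Drop-seq tools and Picard are left out because they are typically installed as directories or jar files rather than single commands.

```python
import importlib.util
import shutil
import sys

# Assumed executable names; adjust them to match your installation.
REQUIRED_EXECUTABLES = ["java", "samtools", "g++", "STAR"]
# Python packages listed in the requirements above.
REQUIRED_PACKAGES = ["numpy", "pandas", "plotnine", "matplotlib"]


def find_missing(executables, packages):
    """Return (missing_executables, missing_packages) for this environment."""
    # shutil.which returns None when a command is not found on PATH.
    missing_exe = [name for name in executables if shutil.which(name) is None]
    # find_spec returns None when a top-level package cannot be imported.
    missing_pkg = [
        name for name in packages if importlib.util.find_spec(name) is None
    ]
    return missing_exe, missing_pkg


if __name__ == "__main__":
    if sys.version_info < (3, 6):
        print("Python 3.6 or above is preferred; found", sys.version.split()[0])
    missing_exe, missing_pkg = find_missing(REQUIRED_EXECUTABLES, REQUIRED_PACKAGES)
    print("Missing executables:", ", ".join(missing_exe) or "none")
    print("Missing Python packages:", ", ".join(missing_pkg) or "none")
```

The check only reports what it cannot find; it does not verify versions of the external tools.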
Build genome reference
The Slide-seq tools need a genome reference in a specific format for alignment and analysis. You can build a genome reference from input FASTA and GTF files.
Command:
build_reference.py manifest_file
See example.buildreference.txt for the manifest file format.
Run the Slide-seq tools
Add the commands below to run.sh and build_reference.sh, or to your bashrc file (the commands may differ on your system):
use Java-1.8
use .samtools-1.7
use Python-3.6
Compile CMatcher (command might be different on your system):
g++ -std=c++11 -o cmatcher cmatcher.cpp
Submit a request to the Slide-seq tools:
python submit_job.py manifest_file
Notice:
- See example.manifest.txt for the manifest file format.
- If email_address is specified in the manifest file, an email from slideseq@gmail.com will be sent to you when the submission is received, when the workflow finishes, and/or when any job fails.
- To speed up processing of NovaSeq and NovaSeq S4 data, the Slide-seq tools split each lane into a few slices, run the alignment steps on the slices in parallel, and combine the alignment outputs.
- See user_doc.txt for detailed usage of the Slide-seq tools.
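The lane slicing mentioned in the notes above follows a split, process in parallel, then combine pattern. The sketch below illustrates that pattern only; it is not the Slide-seq tools' implementation. The function names are hypothetical, align_slice is a placeholder for the real alignment step, and a thread pool stands in for whatever job scheduler the workflow actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(records, num_slices):
    """Split a list into at most num_slices contiguous, near-equal slices."""
    size, extra = divmod(len(records), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        # The first `extra` slices take one additional record each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            slices.append(records[start:end])
        start = end
    return slices


def align_slice(reads):
    """Hypothetical placeholder for the per-slice alignment step."""
    return [read.upper() for read in reads]


def process_lane(reads, num_slices=4):
    """Split a lane, process the slices concurrently, and merge in order."""
    slices = split_into_slices(reads, num_slices)
    with ThreadPoolExecutor(max_workers=len(slices) or 1) as pool:
        # Executor.map yields results in the order the slices were submitted,
        # so the combined output preserves the original read order.
        results = list(pool.map(align_slice, slices))
    return [item for part in results for item in part]


if __name__ == "__main__":
    lane = ["acgt", "ttga", "ccat", "ggta", "aacc"]
    print(process_lane(lane, num_slices=2))
```

Keeping the slices contiguous and merging them in submission order means the combined output matches what a single unsplit run over the same placeholder step would produce.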