Processing scRNA-Seq data from the following paper:
Kinker, G.S., Greenwald, A.C., Tal, R. et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat Genet 52, 1208–1218 (2020).
The layout of this workflow follows this template by Johannes Köster. Given an SRA project ID, the pipeline downloads BAM files, converts them to FASTQ files using bamtofastq, and uses Cell Ranger to align reads and generate expression matrices.
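As a rough illustration of those three stages, the sketch below prints (but does not run) the commands for one sample. It is not the workflow's actual rule code: the helper name `plan_sample`, the sample name, the example URL, the `fastq/` output layout, and the reference path are all assumptions made for illustration.

```shell
# Hypothetical helper: print the three per-sample steps without executing them.
# Arguments: sample name, BAM URL, Cell Ranger reference directory.
plan_sample() {
  sample=$1; bam_url=$2; ref=$3
  # 1. Download the original BAM file (the URL is a placeholder).
  printf 'wget -O %s.bam %s\n' "$sample" "$bam_url"
  # 2. Convert the BAM back to FASTQ files with 10x Genomics bamtofastq.
  printf 'bamtofastq --nthreads=4 %s.bam fastq/%s\n' "$sample" "$sample"
  # 3. Align reads and build the expression matrix with Cell Ranger.
  #    bamtofastq writes FASTQs into subdirectories of its output path,
  #    so --fastqs may need to point at one of those instead.
  printf 'cellranger count --id=%s --transcriptome=%s --fastqs=fastq/%s\n' \
    "$sample" "$ref" "$sample"
}

plan_sample SAMPLE1 https://example.org/SAMPLE1.bam /path/to/refdata
```

Printing the commands first makes it easy to check paths before committing to a long alignment run. Depending on the installed Cell Ranger version, `cellranger count` may require additional arguments.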
git clone https://github.com/y1zhou/GSE157220
Configure the workflow according to your needs by editing the file config.yaml. Afterwards, install all required packages and generate the design matrix:
# Prepare environment
conda env create -f=environment.yaml
conda activate scrna-cellranger
# Generate sample matrix
snakemake --cores 1 generate_srr_id
Test your configuration by performing a dry-run via
snakemake -np --reason
Execute the workflow locally via
snakemake --cores <num-cores> --reason
See the Snakemake documentation for further details.