snakemake-workflows / rna-seq-star-deseq2

RNA-seq workflow using STAR and DESeq2

read_distribution.py

Moonerss opened this issue

The qc.smk file calls many Python scripts, but I can't find them anywhere in the workflow folder.
One example is read_distribution.py:

rule rseqc_readdis:
    input:
        bam="results/star/{sample}-{unit}/Aligned.sortedByCoord.out.bam",
        bed="results/qc/rseqc/annotation.bed",
    output:
        "results/qc/rseqc/{sample}-{unit}.readdistribution.txt",
    priority: 1
    log:
        "logs/rseqc/rseqc_readdis/{sample}-{unit}.log",
    conda:
        "../envs/rseqc.yaml"
    shell:
        "read_distribution.py -r {input.bed} -i {input.bam} > {output} 2> {log}"

These scripts are not part of the workflow repository. They belong to the RSeQC package (rseqc), which is installed through the environment named in the rule's conda: "../envs/rseqc.yaml" directive. You can find the corresponding specification file here:
https://github.com/snakemake-workflows/rna-seq-star-deseq2/blob/b93ed3fcd195a21815406369af3a113d8f6e23e8/workflow/envs/rseqc.yaml

This installs rseqc from bioconda. With the corresponding conda environment activated, read_distribution.py and many other RSeQC scripts are available on your PATH. For a list and documentation of all the scripts, see the rseqc documentation. To get to know (bio-)conda, check out the bioconda documentation and its links to the conda documentation.
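
To check for yourself that the scripts come from the environment and not from the workflow folder, a small sketch like the following can be run with the rseqc environment activated. It only looks each name up on PATH with the standard library. The list of script names is an assumption chosen for illustration (read_distribution.py is the one from the rule above), and find_scripts is a hypothetical helper, not part of the workflow or of RSeQC.

```python
import shutil

# Script names assumed for illustration; extend the list with whatever
# commands your qc.smk rules call.
RSEQC_SCRIPTS = ["read_distribution.py", "infer_experiment.py", "bam_stat.py"]


def find_scripts(names):
    """Map each script name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_scripts(RSEQC_SCRIPTS).items():
        if path is None:
            print(f"{name}: not found on PATH (is the conda environment activated?)")
        else:
            print(f"{name}: {path}")
```

If a script is reported as missing, the most likely cause is that the environment is not active. When the workflow itself is run, Snakemake creates and activates the environment for each rule only if conda deployment is enabled, for example with the --use-conda command line flag.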