correct installation of workflow with R
ceesu opened this issue
Hello, thanks so much for making this repository.
I started by following the installation steps linked in the README, but noticed that steps 1-3 there give me a Conda environment that does not contain any distribution of R.
I see R packages in the workflow/envs directory here on GitHub and was wondering how to do the installation so that the environment files, such as biomart.yaml and deseq2.yaml, are set up properly.
Hi @ceesu,
the main conda environment that you use to run the snakemake workflow only needs to contain snakemake (and snakedeploy) and all of its dependencies, so this is what step 1 creates.
When run, snakemake will create separate environments for each rule, based on the workflow/envs/*.yaml files (or those specified in the snakemake wrappers used) and will take care to activate and deactivate them when running the respective rule as a job.
This environment creation with conda will only run once, when you first start your workflow. For future invocations of the workflow, these environments will already exist and will just be activated whenever necessary.
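To see which per-rule environments already exist, one option is to look inside the .snakemake/conda directory of the working directory (the same location that shows up in the activate commands of snakemake's error messages). The following is only a minimal sketch: the function names are hypothetical, and it assumes that snakemake keeps a copy of each environment definition as a .yaml file next to the environment directory, with the same name. Depending on your Snakemake version, `snakemake --list-conda-envs` may give you the same overview directly.

```python
from pathlib import Path


def list_snakemake_envs(conda_prefix):
    """Map each environment directory under conda_prefix to its YAML copy (or None)."""
    prefix = Path(conda_prefix)
    envs = {}
    for entry in sorted(prefix.iterdir()):
        if not entry.is_dir():
            continue
        # Assumption: the YAML copy sits next to the directory and shares its name.
        yaml_copy = entry.parent / (entry.name + ".yaml")
        envs[entry.name] = yaml_copy if yaml_copy.is_file() else None
    return envs


def env_dependencies(yaml_path):
    """Return the entries listed under 'dependencies:' without needing a YAML parser."""
    deps = []
    in_deps = False
    for line in Path(yaml_path).read_text().splitlines():
        stripped = line.strip()
        if stripped.startswith("dependencies:"):
            in_deps = True
        elif in_deps and stripped.startswith("- "):
            deps.append(stripped[2:].strip())
        elif in_deps and stripped and not stripped.startswith("#"):
            # Any other non-comment key ends the dependencies section.
            in_deps = False
    return deps


if __name__ == "__main__":
    prefix = Path(".snakemake/conda")
    if prefix.is_dir():
        for name, yaml_copy in list_snakemake_envs(prefix).items():
            deps = env_dependencies(yaml_copy) if yaml_copy else []
            print(name, deps)
```

Run from the workflow's working directory, this prints each environment hash together with the packages it was built from, which makes it easier to tell which directory corresponds to, for example, deseq2.yaml.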
So, you really shouldn't need to do anything but the recommended installation steps, including setting up the config/config.yaml and config/*.tsv files as described. Once this is done, all the rest should work automatically.
Hope that helps clear things up. Let us know if anything remains unclear.
Hi @dlaehnemann, thank you so much for replying. The reason I brought this up is that I am getting the following errors very late in the pipeline, which seem to be related to converting gene symbols and then to DESeq2:
Error in rule gene_2_symbol:
jobid: 31
input: results/counts/all.tsv
output: results/counts/all.symbol.tsv
log: logs/gene2symbol/results/counts/all.log (check log file(s) for error details)
RuleException:
CalledProcessError in file https://raw.githubusercontent.com/snakemake-workflows/rna-seq-star-deseq2/v2.0.0/workflow/rules/diffexp.smk, line 32:
Command 'source /Users/XXX/mambaforge/bin/activate '/Users/XXX/bulk-rna-pipeline/.snakemake/conda/bdfec4f6ca99adf3b9bdfc5575ffaa38_'; set -euo pipefail; Rscript --vanilla /Users/XXX/.snakemake/scripts/tmphosyuxvq.gene2symbol.R' returned non-zero exit status 1.
File "https://raw.githubusercontent.com/snakemake-workflows/rna-seq-star-deseq2/v2.0.0/workflow/rules/diffexp.smk", line 32, in __rule_gene_2_symbol
File "/Users/XXX/mambaforge/envs/snakemake/lib/python3.11/concurrent/futures/thread.py", line 58, in run
Error in rule deseq2_init:
jobid: 3
input: results/counts/all.tsv
output: results/deseq2/all.rds, results/deseq2/normcounts.tsv
log: logs/deseq2/init.log (check log file(s) for error details)
conda-env: /Users/XXX/.snakemake/conda/8004cf7ad2dcdd3144e9519eaaeff699_
RuleException:
CalledProcessError in file https://raw.githubusercontent.com/snakemake-workflows/rna-seq-star-deseq2/v2.0.0/workflow/rules/diffexp.smk, line 47:
Command 'source /Users/XXX/mambaforge/bin/activate '/Users/cathysu/bulk-rna-pipeline/.snakemake/conda/8004cf7ad2dcdd3144e9519eaaeff699_'; set -euo pipefail; Rscript --vanilla /Users/XXX/.snakemake/scripts/tmp2p2r0qgy.deseq2-init.R' returned non-zero exit status 1.
File "https://raw.githubusercontent.com/snakemake-workflows/rna-seq-star-deseq2/v2.0.0/workflow/rules/diffexp.smk", line 47, in __rule_deseq2_init
File "/Users/XXX/mambaforge/envs/snakemake/lib/python3.11/concurrent/futures/thread.py", line 58, in run
I want to be able to check what's going on, but for the first error, the log logs/gene2symbol/results/counts/all.log is empty.
For the second error, the log contains the following:
Error: package or namespace load failed for ‘DESeq2’ in library.dynam(lib, package, package.lib):
shared object ‘BiocParallel.dylib’ not found
Is there a way I could find out which environments I should modify to fix this?
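One starting point for narrowing this down might be to check whether the environment named in the `conda-env:` line of the error contains any shared library for BiocParallel at all. The sketch below is only an illustration: the function name and the list of suffixes are assumptions, and it merely reports matching files without saying why R failed to load them.

```python
from pathlib import Path

# Assumption: shared libraries use one of these suffixes on the platform in question.
SHARED_LIB_SUFFIXES = (".so", ".dylib", ".dll")


def find_shared_objects(env_dir, package):
    """Return all files named <package>.<shared-lib suffix> anywhere below env_dir."""
    hits = []
    for path in Path(env_dir).rglob(package + ".*"):
        if path.is_file() and path.suffix in SHARED_LIB_SUFFIXES:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python find_libs.py <path from the conda-env line> BiocParallel
    if len(sys.argv) == 3:
        for hit in find_shared_objects(sys.argv[1], sys.argv[2]):
            print(hit)
```

An empty result would suggest that the package is missing or incomplete in that environment; a non-empty one would point towards a loading or linking problem instead.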