Code and Data for
The evolution of white-tailed jackrabbit camouflage in response to past and future seasonal climates
This GitHub repository contains the code and some of the data needed to recreate the analyses from:
XXX CITATION FOR PAPER HERE.
The top level of this repository contains code and data needed to recreate the species distribution modelling (SDM), phenotypic modelling, and evolutionary simulations. This repository also contains a folder, genomics/, containing code to run the additional genomic analyses performed on all 143 samples across the range. See the README within that folder for further information on those analyses. The remainder of this README focuses on the other analyses.
We have used a set of software tools to try to make this analysis pipeline as reproducible as possible. We used the Python program Snakemake to build the pipeline. Most analyses are performed in R, and we used the renv package to manage libraries/packages for R.
You will need the following software installed on your system:
- Python 3. We recommend installing Python 3 through Anaconda or Miniconda. We used Python version 3.8.3.
- Snakemake. We recommend installing snakemake through conda; see the Snakemake installation instructions. We used version 5.14.0.
- R. R can be downloaded from the R Project for Statistical Computing. We used R version 4.0.2.
- The R package renv. See the renv documentation for instructions on installing renv. We used version 0.12.0.
- R package dependencies. These dependencies are managed with the renv package. To install the dependencies, open this project in RStudio (easiest) or, in your terminal, set this project as your working directory and then launch R. Then, call renv::restore() to install the necessary packages (this may take a while).
- Pandoc, for compiling .Rmd files into .html files. We recommend installing pandoc through conda; see the conda installation instructions for pandoc. We used pandoc version 2.10.1.
- SLiM, a program for forward genetic simulations. See the SLiM website for information on installation. We used version 3.6.
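Before running the pipeline, it can help to confirm that the command-line tools listed above are visible on your PATH. The following is a minimal sketch, not part of this repository: the function name and the executable names (python3, snakemake, R, pandoc, slim) are assumptions and may differ on your system. It does not check versions or R packages such as renv.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ["python3", "snakemake", "R", "pandoc", "slim"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

If a tool is reported as missing, check that the conda environment containing it is activated before launching the pipeline.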
This repository is set up as an RStudio project. You can open it in RStudio by opening the wtjr_sdm.Rproj file. The project is organized into a few top-level folders:
- processed_data - Data that has been processed/curated and is ready for analysis. All these files are created from raw data files (in the raw_data folder) and analysis scripts found in src/.
- raw_data - Raw data obtained for this project, without any processing. See the README file in that folder for more details on the data source for each item. NB: Many raw data files are included in this repository, but some raw data files are too large to be uploaded to GitHub. See the README file in the raw data folder for information on how to obtain those files, either from Dryad or from the original sources we used.
- renv - Package/library management for this project. This folder is created and maintained by the renv package; users should not need to edit it.
- results - Analysis results, generated from raw data, processed data, and scripts.
- sp_wtjr - A folder containing an example snakemake profile for use on an HPC cluster using the SLURM job scheduler.
- src - Scripts and functions written for this project. Some scripts are in .Rmd files, which generate corresponding .html files in the docs folder upon knitting.
Other important files:
- Snakefile - The main snakemake workflow for the SDM, phenotypic modelling, and evolutionary simulation analysis.
- slim_simulations.smk - A snakemake workflow (a sub-workflow of the main Snakefile) used to run and compile the results of the evolutionary simulations performed in SLiM.
Some raw data files are too large to be uploaded to GitHub. They must be obtained through other sources and then copied into the appropriate folder before running the pipeline. See the README file in the raw_data/ folder for instructions on how to obtain these files.
Once all the necessary software is installed and you have obtained the necessary data, you can use snakemake to run our analysis pipeline. First, set this project directory as your working directory, and then execute:
snakemake
This will run snakemake, but it may not run the full pipeline: some of the intermediate files and results are already in this repository and may not need to be re-generated. If you would like to force the pipeline to re-run from scratch, run:
snakemake -F
This should run the full pipeline, though note that it will likely take a long time. Also, you will likely get results, maps, and plots that differ slightly from our published analysis. This is because the model-fitting processes are not deterministic, so there will be a small amount of numerical variation upon re-running.
Finally, the pipeline may fail if you attempt to run it on a personal computer with insufficient processing power, memory, or storage. We ran our analysis on an HPC cluster, and recommend you do the same. See the snakemake documentation for guidance on running snakemake on a cluster and how to set up a profile for specifying HPC options. A sample command for running our pipeline on a cluster, using the snakemake profile found in the sp_wtjr folder, is:
snakemake --profile sp_wtjr/
but you will need to create your own profile adapted to your own HPC.
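If you prefer to launch the pipeline from a script, the three invocations above can be assembled programmatically. The sketch below is illustrative only and is not part of this repository: the function names are hypothetical. It uses snakemake's -n (dry run), -F (force all), and --profile options; a dry run lists the jobs that would be executed without running them, which is a cheap way to check the workflow before committing cluster time.

```python
import subprocess


def build_snakemake_command(force=False, profile=None, dry_run=False):
    """Assemble a snakemake command line as a list of arguments.

    force   -- add -F to re-run the whole pipeline from scratch
    profile -- path to a snakemake profile folder, e.g. "sp_wtjr/"
    dry_run -- add -n to list the planned jobs without executing them
    """
    command = ["snakemake"]
    if dry_run:
        command.append("-n")
    if force:
        command.append("-F")
    if profile is not None:
        command.extend(["--profile", profile])
    return command


def run_pipeline(**options):
    """Run snakemake in the current working directory (the project root)."""
    command = build_snakemake_command(**options)
    print("Running:", " ".join(command))
    # check=True raises CalledProcessError if snakemake exits with an error.
    return subprocess.run(command, check=True)
```

For example, run_pipeline(dry_run=True) previews the jobs, and run_pipeline(profile="sp_wtjr/") submits them using that profile, which you would first replace with one adapted to your own cluster.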