1000 tools paper
This repository contains code and analysis for the "Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape" publication.
Directory structure
R/
- Folder containing R code files. See documentation in files for more detail.
output/
- Output files created by the workflow. These versions are stored here as a record but will be overwritten by running the workflow.
data-tables/
- Data files created by the workflow
figures/
- Figures shown in the paper
supplementary/
- Supplementary figures shown in the paper
tables/
- Tables shown in the paper
.gitignore
- Git configuration file
1000-tools.Rproj
- RStudio project file
LICENSE.md
- License file
README.md
- This README file
renv/
- Internal {renv} files
renv.lock
- {renv} lock file specifying R package dependencies
Other files created by set up or the workflow
.Renviron
- Local R configuration file, created as part of setting up
_cache/
- Cache files created by the workflow to speed up some parts of the analysis
_targets/
- Internal {targets} files
Setting up
R Dependencies
R package dependencies are managed using {renv}. They should be installed automatically when you start an R session inside the repository, but to make sure, run:
renv::restore()
Crossref
Information about publications and preprints is retrieved from the Crossref API using the {rcrossref} package.
As explained in ?rcrossref::`rcrossref-package`, Crossref provides faster access to people who give an email address.
To do this, add the following line to your .Renviron:
crossref_email=your@email.com
johnnydep
Package dependencies for PyPI tools are retrieved using the johnnydep tool (https://pypi.org/project/johnnydep/). For these stages to work you must have johnnydep installed. The easiest way to do that is using pip or conda:
pip install johnnydep
# OR
conda install johnnydep
Once johnnydep is installed, find the path to it using which, then set a JOHNNYDEP_PATH variable to that path in your .Renviron:
which johnnydep
JOHNNYDEP_PATH=/path/to/your/johnnydep
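The two steps above can also be scripted. The following is a minimal sketch, not part of this repository: set_renviron_var is a hypothetical helper name, and it assumes .Renviron uses plain KEY=value lines without spaces around the equals sign. It replaces any existing line for the key, so it can be re-run safely.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not part of this repository):
# set KEY=VALUE in an .Renviron-style file, replacing any existing
# definition of KEY. Assumes plain KEY=value lines.
set_renviron_var() {
    key=$1
    value=$2
    file=${3:-.Renviron}   # defaults to .Renviron in the current directory

    # Create the file if it does not exist yet
    [ -f "$file" ] || : > "$file"

    # Copy every line except an existing definition of KEY, then append
    # the new definition and swap the result into place
    tmp=$(mktemp) || return 1
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example usage, run from the repository root:
#   set_renviron_var JOHNNYDEP_PATH "$(command -v johnnydep)"
```

Here command -v is used as a portable equivalent of which. The same helper could be used for the crossref_email line described earlier. Restart your R session afterwards so the updated .Renviron is read.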
Fonts
To make sure fonts used in plots are available, follow these steps:
- Download and install the Noto Sans and Noto Sans Math fonts
  - For macOS users the easiest way to do this is using Homebrew:
    brew install font-noto-sans font-noto-sans-math
  - Noto Sans is also available from Google Fonts
- Import fonts into R by running
  extrafont::font_import()
If these fonts are not available the plots will still be produced, but they will use the standard default font.
Analytics
This workflow can also generate an analysis of usage of the scRNA-tools website. This requires access to the Google Analytics group, so most people will need to switch it off.
No analytics access (most people)
Edit the _targets.R file and make sure the include_analytics variable is set to FALSE:
include_analytics <- FALSE
Analytics access
Data for plots showing usage statistics of the scRNA-tools website are collected using the {googleAnalyticsR} package.
For this to work you must set up authentication with the googleAnalyticsR::ga_auth_setup() function, following the instructions at https://code.markedmondson.me/googleAnalyticsR/articles/setup.html.
At the end of the process your .Renviron file should contain lines similar to these:
GAR_CLIENT_JSON=/path/to/oauth.json
GARGLE_EMAIL=your@email.com
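Before running the workflow with analytics enabled, it may help to confirm that both variables are present. The sketch below is an illustration only: check_renviron_vars is a hypothetical helper that is not part of this repository, and it only checks that a KEY=... line exists, not that the values are valid.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not part of this repository):
# report which of the given variable names have a KEY=... line in an
# .Renviron-style file. Returns non-zero if any are missing.
check_renviron_vars() {
    file=$1
    shift
    missing=0
    for key in "$@"; do
        if grep -q "^${key}=" "$file" 2>/dev/null; then
            echo "ok: $key"
        else
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage, run from the repository root:
#   check_renviron_vars .Renviron GAR_CLIENT_JSON GARGLE_EMAIL
```

A non-zero exit status means at least one variable was not found, in which case the authentication setup should be repeated or include_analytics set to FALSE.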
Running analysis
The analysis workflow is managed using {targets}. Once setup is complete, you can run the workflow using:
targets::tar_make()
Some of the steps (collecting reference and GitHub repository information) take a while to run.
Once the workflow is complete, various output files will be created in the output/ directory.
If you want to view any of the intermediate parts of the workflow you can load the output of any target using:
targets::tar_load(target_name)
Updating analysis
The analysis is pinned to a particular date and version of the scRNA-tools database.
If you want to repeat the analysis for a more recent version, edit _targets.R and modify the date target:
tar_target(
    date,
    "YYYY-MM-DD"
)
License
The code is available under the MIT license.
Citation
If you use any of the code in this repository or analysis in the publication please cite:
Zappia L, Theis FJ. "Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape", Genome Biology (2021), DOI: 10.1186/s13059-021-02519-4
@ARTICLE{Zappia2021-bc,
title = "Over 1000 tools reveal trends in the single-cell {RNA-seq}
analysis landscape",
author = "Zappia, Luke and Theis, Fabian J",
journal = "Genome Biol.",
volume = 22,
number = 1,
pages = "301",
month = oct,
year = 2021,
language = "en",
doi = "10.1186/s13059-021-02519-4",
url = "https://doi.org/10.1186/s13059-021-02519-4"
}