latchbio / quantaf

Quantaf

Introduction

Many single-cell and single-nucleus datasets are publicly available. Some of these datasets are used over and over again to develop new methods, demonstrate new tools, and illustrate tutorials. For example, 10x Genomics provides various publicly available datasets for free download on its website (https://www.10xgenomics.com/resources/datasets), and many of these are commonly used in single-cell tutorials and vignettes.

To stop re-inventing the wheel, here we introduce an alevin-fry workflow written in Nextflow that can quantify an arbitrary number of single-cell RNA-sequencing projects (https://github.com/COMBINE-lab/10x-requant) in one command. As input, the workflow requires three spreadsheets under the input_files directory:

  1. sample_sheet.tsv: This spreadsheet records the detailed information of the datasets one would like to process. Please refer to the provided sheet for examples.
  2. ref_sheet.tsv: This spreadsheet lists the references from which the alevin-fry splici references are built. Currently, each provided reference must be a pre-built Cell Ranger reference, for example, human2020A and mm10-2020A. Every reference specified in the reference column of sample_sheet.tsv must be included in ref_sheet.tsv; otherwise the workflow cannot tell which reference the reads should be mapped against.
  3. pl_sheet.tsv: This spreadsheet records the permit lists that will be used for deduplicating cellular barcodes in alevin-fry, for example, the permit lists used for the 10x Chromium V2 and V3 chemistries. Every chemistry specified in the chemistry column of sample_sheet.tsv must be included in pl_sheet.tsv.

Given the three required spreadsheets, the workflow downloads the references, builds the splici references and indices, and runs the alevin-fry pipeline for each dataset in sample_sheet.tsv.
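
Because every reference and chemistry named in sample_sheet.tsv has to appear in the other two sheets, it can help to check the sheets before launching the workflow. The following is a minimal Python sketch of such a check, not part of the workflow itself. The reference and chemistry columns of sample_sheet.tsv come from the description above; the assumption that ref_sheet.tsv and pl_sheet.tsv use the same column names is ours, so adjust ref_key and pl_key to match the provided sheets. The function name missing_entries is hypothetical.

```python
import csv
import io


def read_column(tsv_text, column):
    """Return the set of non-empty values found in one column of a TSV table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    values = set()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            values.add(value)
    return values


def missing_entries(sample_tsv, ref_tsv, pl_tsv,
                    ref_key="reference", pl_key="chemistry"):
    """Report values used in sample_sheet.tsv but absent from the other sheets.

    ref_key and pl_key are ASSUMED column names for ref_sheet.tsv and
    pl_sheet.tsv; the real sheets may name these columns differently.
    """
    wanted_refs = read_column(sample_tsv, "reference")
    wanted_chems = read_column(sample_tsv, "chemistry")
    return {
        "reference": sorted(wanted_refs - read_column(ref_tsv, ref_key)),
        "chemistry": sorted(wanted_chems - read_column(pl_tsv, pl_key)),
    }


if __name__ == "__main__":
    # Tiny made-up sheets, for illustration only.
    sample = "dataset_name\treference\tchemistry\nA\thuman2020A\tv3\n"
    ref = "reference\nhuman2020A\n"
    pl = "chemistry\nv2\n"
    print(missing_entries(sample, ref, pl))
```

To check the real sheets, read each file under input_files as text and pass the three strings to missing_entries; two empty lists mean that every reference and chemistry is covered.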

The outputs of this workflow are written to the nf_pipeline/alevin_fry folder, which contains the quantification results of all the processed datasets. For simplicity, each quantification folder is named after the MD5sum of its dataset. One can find the name and URL of the dataset in the dataset_description.txt file in each quantification result folder.
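
Since the folder names are MD5sums rather than dataset names, a small helper can map each folder back to its description. This is a minimal Python sketch under the layout described above (one dataset_description.txt per quantification folder under nf_pipeline/alevin_fry). The internal format of dataset_description.txt is not specified here, so the sketch returns its text unparsed; the function name list_quant_results is hypothetical.

```python
from pathlib import Path


def list_quant_results(root="nf_pipeline/alevin_fry"):
    """Map each quantification folder name to the text of its description file."""
    results = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing has been produced yet, or the path is wrong.
        return results
    for folder in sorted(p for p in root_path.iterdir() if p.is_dir()):
        description = folder / "dataset_description.txt"
        if description.is_file():
            results[folder.name] = description.read_text(encoding="utf-8").strip()
    return results


if __name__ == "__main__":
    for name, text in list_quant_results().items():
        print(name)
        print(text)
        print()
```

Folders without a dataset_description.txt file are skipped rather than reported as errors.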

Processed 10x datasets

Using this workflow, we have collected and processed a number of datasets from the 10x Genomics website. Below are links to the quantification results generated by alevin-fry. For more information, please check the quantaf webpage.

  1. 500 Human PBMCs, 3' LT v3.1, Chromium Controller: link to the quant result
  2. 500 Human PBMCs, 3' LT v3.1, Chromium X: link to the quant result
  3. 1k PBMCs from a Healthy Donor (v3 chemistry): link to the quant result
  4. 10k PBMCs from a Healthy Donor (v3 chemistry): link to the quant result
  5. 10k Human PBMCs, 3' v3.1, Chromium X: link to the quant result
  6. 10k Human PBMCs, 3' v3.1, Chromium Controller: link to the quant result
  7. 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Single Indexed: link to the quant result
  8. 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Dual Indexed: link to the quant result
  9. 20k Human PBMCs, 3' HT v3.1, Chromium X: link to the quant result
  10. PBMCs from EDTA-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
  11. PBMCs from Heparin-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
  12. PBMCs from ACD-A Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
  13. PBMCs from Citrate-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
  14. PBMCs from Citrate-Treated Cell Preparation Tubes (3' v3.1 Chemistry): link to the quant result
  15. PBMCs from a Healthy Donor: Whole Transcriptome Analysis: link to the quant result
  16. Whole Blood RBC Lysis for PBMCs and Neutrophils, Granulocytes, 3': link to the quant result
  17. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 5): link to the quant result
  18. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 1): link to the quant result
  19. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 5): link to the quant result
  20. Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 1): link to the quant result
  21. Hodgkin's Lymphoma, Dissociated Tumor: Whole Transcriptome Analysis: link to the quant result
  22. 200 Sorted Cells from Human Glioblastoma Multiforme, 3’ LT v3.1: link to the quant result
  23. 750 Sorted Cells from Human Invasive Ductal Carcinoma, 3’ LT v3.1: link to the quant result
  24. 2k Sorted Cells from Human Glioblastoma Multiforme, 3’ v3.1: link to the quant result
  25. 7.5k Sorted Cells from Human Invasive Ductal Carcinoma, 3’ v3.1: link to the quant result
  26. Human Glioblastoma Multiforme: 3’v3 Whole Transcriptome Analysis: link to the quant result
  27. 1k Brain Cells from an E18 Mouse (v3 chemistry): link to the quant result
  28. 10k Brain Cells from an E18 Mouse (v3 chemistry): link to the quant result
  29. 1k Heart Cells from an E18 mouse (v3 chemistry): link to the quant result
  30. 10k Heart Cells from an E18 mouse (v3 chemistry): link to the quant result
  31. 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Single Indexed: link to the quant result
  32. 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Dual Indexed: link to the quant result
  33. 1k PBMCs from a Healthy Donor (v2 chemistry): link to the quant result
  34. 1k Brain Cells from an E18 Mouse (v2 chemistry): link to the quant result
  35. 1k Heart Cells from an E18 mouse (v2 chemistry): link to the quant result

About

License:BSD 3-Clause "New" or "Revised" License
