ghuls / cellsnp-lite

Efficient genotyping of bi-allelic SNPs on single cells

Home Page: https://cellsnp-lite.readthedocs.io

cellsnp-lite

cellsnp-lite was initially designed to pile up the expressed alleles in single-cell or bulk RNA-seq data. The output can be used directly for donor deconvolution in multiplexed single-cell RNA-seq data, particularly with vireo, which assigns cells to donors and detects doublets, even without a genotyping reference. Besides RNA-seq data, cellsnp-lite can now also be applied to DNA-seq and ATAC-seq data, either bulk or single-cell.

cellsnp-lite depends heavily on htslib. It should give results very similar to those of samtools/bcftools mpileup. There are, however, two major differences compared to bcftools mpileup:

  1. cellsnp-lite can pile up a list of positions and directly split the results by a list of cell barcodes, e.g., for 10x Genomics data. With bcftools, you may need to manipulate the RG tag in the BAM file if you want to divide reads into cell barcode groups.
  2. cellsnp-lite uses simple filtering when outputting SNPs, i.e., total UMIs or counts and minor allele fraction. The idea is to keep most of the information on each SNP so that the downstream statistical model can make full use of it.
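
The two points above can be illustrated with a small toy sketch in Python. It is not the cellsnp-lite implementation: the function names `pileup_by_barcode` and `passes_filter`, the thresholds, and the definition of the minor allele as the second most frequent base across all cells are assumptions made for illustration only. Reads are represented as plain `(position, barcode, base)` tuples rather than real alignments.

```python
from collections import Counter, defaultdict


def pileup_by_barcode(reads, barcodes):
    """Count bases per position and per cell barcode (toy pileup).

    `reads` is an iterable of (position, barcode, base) tuples; this is a
    hypothetical stand-in for aligned reads that carry a cell-barcode tag.
    Reads whose barcode is not in `barcodes` are ignored.
    """
    allowed = set(barcodes)
    counts = defaultdict(lambda: defaultdict(Counter))
    for pos, barcode, base in reads:
        if barcode in allowed:
            counts[pos][barcode][base] += 1
    return counts


def passes_filter(per_cell, min_count, min_maf):
    """Keep a position if its aggregated count and minor allele fraction
    reach the given thresholds (assumed definitions, see the lead-in)."""
    total_by_base = Counter()
    for base_counts in per_cell.values():
        total_by_base.update(base_counts)
    total = sum(total_by_base.values())
    if total == 0 or total < min_count:
        return False
    ranked = [n for _, n in total_by_base.most_common()]
    minor = ranked[1] if len(ranked) > 1 else 0
    return minor / total >= min_maf


reads = [
    ("chr1:100", "AAAC", "A"),
    ("chr1:100", "AAAC", "A"),
    ("chr1:100", "TTTG", "G"),
    ("chr1:100", "NNNN", "G"),  # barcode not in the list, ignored
    ("chr1:200", "AAAC", "C"),
    ("chr1:200", "TTTG", "C"),
]
counts = pileup_by_barcode(reads, ["AAAC", "TTTG"])
kept = sorted(
    pos for pos, per_cell in counts.items()
    if passes_filter(per_cell, min_count=2, min_maf=0.1)
)
print(kept)
```

In this toy input, `chr1:200` shows only one allele, so its minor allele fraction is zero and it is dropped, while `chr1:100` is kept with its per-cell counts intact for a downstream model.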

cellsnp-lite is the C version of cellSNP, which is implemented in Python. Compared to cellSNP, cellsnp-lite is more efficient, with higher speed and lower memory usage. Benchmarking results can be found in the preprint. Note that both the old and the latest versions of the benchmarking scripts are now in a new repo, csp_benchmark.

News

All release notes can be found in doc/release.rst.

For computational efficiency, we have started collecting notes in doc/speed.rst.

A pre-compiled list of candidate SNPs for human is available at Candidate_SNPs.

Citation

If you find cellsnp-lite useful for your research, please cite:

Xianjie Huang, Yuanhua Huang, Cellsnp-lite: an efficient tool for genotyping single cells, Bioinformatics, 2021, btab358, https://doi.org/10.1093/bioinformatics/btab358

(cellsnp-lite previously had a preprint on bioRxiv)

Installation

cellsnp-lite is implemented in C. You can install it via conda or from this GitHub repo.

Install via conda (latest stable version)

This is the recommended way to install cellsnp-lite. It avoids potential dependency issues and is simple and fast if conda is available on the machine.

Step 1: add config

conda config --add channels bioconda
conda config --add channels conda-forge

Step 2: install

to your current environment:

conda install cellsnp-lite

or to a new environment:

conda create -n CSP cellsnp-lite     # you can replace 'CSP' with another env name.

Install from this GitHub repo (latest stable/dev version)

We recommend installing cellsnp-lite via conda, as described above. To compile from source code, refer to install_from_repo.

Manual

The full manual is at https://cellsnp-lite.readthedocs.io.

You can also run cellsnp-lite -h to list all arguments supported by the version you are using.

FAQ and releases

For troubleshooting, please have a look at FAQ.rst. We welcome reports of any issues.

About

License: Apache License 2.0


Languages

C 63.2%, Shell 14.9%, Makefile 14.7%, M4 3.8%, Python 2.3%, R 1.0%