AlicePsyche / scRNAseq-analysis-notes

my scRNAseq analysis notes

The reason

Single-cell RNA-seq is becoming more and more popular, and as a technique it might become as common as PCR. I just got some 10x Genomics single-cell RNA-seq data to play with, so it is a good time for me to take notes here. I hope they are useful for other people as well.

readings before doing anything

single cell tutorials

single cell RNA-seq normalization

single cell batch effect

Single cell RNA-seq

Considerable differences are found between the methods in terms of the number and characteristics of the genes that are called differentially expressed. Pre-filtering of lowly expressed genes can have important effects on the results, particularly for some of the methods originally developed for analysis of bulk RNA-seq data. Generally, however, methods developed for bulk RNA-seq analysis do not perform notably worse than those developed specifically for scRNA-seq.
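The pre-filtering step mentioned above can be sketched in a few lines. This is only an illustration under assumed thresholds: the function name `filter_low_genes`, the `min_cells` and `min_count` cutoffs, and the toy count table are all hypothetical and not taken from the quoted paper.

```python
from typing import Dict, List


def filter_low_genes(
    counts: Dict[str, List[int]],
    min_cells: int = 3,   # assumed cutoff, tune for your data
    min_count: int = 1,   # assumed cutoff, tune for your data
) -> Dict[str, List[int]]:
    """Keep genes with at least `min_count` reads/UMIs in at least `min_cells` cells."""
    kept = {}
    for gene, per_cell in counts.items():
        n_detected = sum(1 for c in per_cell if c >= min_count)
        if n_detected >= min_cells:
            kept[gene] = per_cell
    return kept


# Toy gene-by-cell count table (6 cells), made up for illustration.
counts = {
    "GeneA": [0, 0, 0, 1, 0, 0],  # detected in 1 cell, dropped
    "GeneB": [5, 3, 0, 8, 2, 1],  # detected in 5 cells, kept
    "GeneC": [0, 2, 0, 0, 3, 4],  # detected in 3 cells, kept
}

filtered = filter_low_genes(counts, min_cells=3)
print(sorted(filtered))  # ['GeneB', 'GeneC']
```

Running the differential expression method of choice on both `counts` and `filtered` is one simple way to see how much the filter changes the set of called genes.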

single cell RNA-seq clustering

  • Geometry of the Gene Expression Space of Individual Cells
  • pcaReduce: Hierarchical Clustering of Single Cell Transcriptional Profiles.
  • CountClust: Clustering and Visualizing RNA-Seq Expression Data using Grade of Membership Models. Fits grade of membership models (GoM, also known as admixture models) to cluster RNA-seq gene expression count data, identifies characteristic genes driving cluster memberships, and provides a visual summary of the cluster memberships
  • FastProject: A Tool for Low-Dimensional Analysis of Single-Cell RNA-Seq Data
  • SNN-Cliq: Identification of cell types from single-cell transcriptomes using a novel clustering method
  • Compare clusterings for single-cell sequencing, a Bioconductor package. The goal of this package is to encourage the user to try many different clustering algorithms in one package structure. We give tools for running many different clusterings and choices of parameters. We also provide visualization to compare many different clusterings and algorithm tools to find common shared clustering patterns.
  • CIDR: Ultrafast and accurate clustering through imputation for single cell RNA-Seq data
  • SC3: consensus clustering of single-cell RNA-Seq data. SC3 achieves high accuracy and robustness by consistently integrating different clustering solutions through a consensus approach. Tests on twelve published datasets show that SC3 outperforms five existing methods while remaining scalable, as shown by the analysis of a large dataset containing 44,808 cells. Moreover, an interactive graphical implementation makes SC3 accessible to a wide audience of users, and SC3 aids biological interpretation by identifying marker genes, differentially expressed genes and outlier cells.
  • GiniClust2: a cluster-aware, weighted ensemble clustering method for cell-type detection
  • FateID infers cell fate bias in multipotent progenitors from single-cell RNA-seq data
  • matchSCore: Matching Single-Cell Phenotypes Across Tools and Experiments. In this work we introduce matchSCore (https://github.com/elimereu/matchSCore), an approach to match cell populations fast across tools, experiments and technologies. We compared 14 computational methods and evaluated their accuracy in clustering and gene marker identification in simulated data sets.
  • Cluster Headache: Comparing Clustering Tools for 10X Single Cell Sequencing Data
  • The celaref (cell labelling by reference) package aims to streamline the cell-type identification step by suggesting cluster labels on the basis of similarity to an already-characterised reference dataset, whether that's from a similar experiment performed previously in the same lab or from a public dataset from a similar sample.
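
Several of the tools above compare the output of different clustering methods. A common generic way to score agreement between two clusterings of the same cells is the adjusted Rand index (ARI). The sketch below is a plain-Python version for illustration; it is not the metric or code used by any specific package listed here, and the labels are made up.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(
    labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]
) -> float:
    """ARI between two labelings of the same cells (1.0 means identical partitions)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    n = len(labels_a)
    if n < 2:
        return 1.0  # nothing to disagree about

    # Count cell pairs that share a cluster in both, in A only, and in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / comb(n, 2)  # agreement expected by chance
    max_index = (in_a + in_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (both - expected) / (max_index - expected)


# Hypothetical cluster labels for four cells from two tools.
tool_1 = [0, 0, 1, 1]
tool_2 = ["a", "a", "b", "b"]   # same partition, different names
tool_3 = [0, 1, 0, 1]           # cuts across tool_1's clusters

print(adjusted_rand_index(tool_1, tool_2))            # 1.0
print(round(adjusted_rand_index(tool_1, tool_3), 3))  # -0.5
```

Cluster names do not need to match between tools; only the grouping of cells matters.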

dimension reduction and visualization of clusters

See https://t.co/yxCb85ctL1: "MDS best choice for preserving outliers, PCA for variance, & T-SNE for clusters" @mikelove @AndrewLBeam

— Rileen Sinha (@RileenSinha) August 25, 2016

paper: Outlier Preservation by Dimensionality Reduction Techniques

"MDS best choice for preserving outliers, PCA for variance, & T-SNE for clusters"
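
To make "PCA for variance" concrete, here is a small sketch that finds the first principal component by power iteration on the covariance matrix and reports the fraction of total variance it explains. It is a teaching example in plain Python with a made-up data set; the function name `first_pc` is hypothetical, and real analyses would use an established PCA implementation instead.

```python
import math
from typing import List, Tuple


def first_pc(data: List[List[float]], n_iter: int = 200) -> Tuple[List[float], float]:
    """Return (first principal axis, fraction of total variance it explains)."""
    n, d = len(data), len(data[0])
    if n < 2:
        raise ValueError("need at least two observations")

    # Center each feature (column) on its mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration. The start vector is arbitrary; it only fails if it is
    # exactly orthogonal to the leading eigenvector.
    vec = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]

    # Rayleigh quotient gives the variance along the axis found.
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)
    )
    total_var = sum(cov[i][i] for i in range(d))
    explained = eigenvalue / total_var if total_var > 0 else 0.0
    return vec, explained


# Toy data lying exactly on the line y = 2x, so one axis carries all variance.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
axis, explained = first_pc(points)
print(round(explained, 6))  # 1.0
```

For this toy data the returned axis points along the direction (1, 2), which is the line the points sit on.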

papers

advance of scRNA-seq tech

The field is advancing so fast!!

check this website for the tools being added:
https://www.scrna-tools.org/

contamination of 10x data

https://twitter.com/constantamateur/status/994832241107849216?s=11

Did you know that droplet based single cell RNA-seq data (like 10X) is contaminated by ambient mRNAs? Good news though, we've written a paper (https://www.biorxiv.org/content/early/2018/04/20/303727 …) and created an R package called SoupX (https://github.com/constantAmateur/SoupX) to fix this problem.

Is this really a problem? It depends on your experiment. Contamination ranges from 2% - 50%. 10% seems common; it's 8% for 10X PBMC data. Solid tissues are typically worse, but there's no way to know in advance. Wouldn't you like to know how contaminated your data are?

These mRNAs come from the single cell suspension fed into the droplet creation system. They mostly get there from lysed cells and so resemble the cells being studied. This means the profile of the contamination is experiment specific and creates a batch effect.
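
The idea behind ambient-RNA correction can be illustrated with a toy calculation: estimate the ambient expression profile from empty droplets, then subtract the expected ambient counts from each cell given a contamination fraction. This is a simplified sketch and not SoupX's actual algorithm or API; the function names, the gene counts, and the 10% contamination value are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


def ambient_profile(empty_droplets: List[Dict[str, int]]) -> Dict[str, float]:
    """Pool counts from empty droplets into per-gene fractions that sum to 1."""
    totals: Counter = Counter()
    for droplet in empty_droplets:
        totals.update(droplet)
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("empty droplets contain no counts")
    return {gene: count / grand_total for gene, count in totals.items()}


def subtract_ambient(
    cell_counts: Dict[str, int],
    profile: Dict[str, float],
    contamination: float,  # assumed known fraction of the cell's UMIs that are ambient
) -> Dict[str, float]:
    """Remove the expected ambient counts from one cell, clipping at zero."""
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be between 0 and 1")
    ambient_umis = contamination * sum(cell_counts.values())
    return {
        gene: max(observed - ambient_umis * profile.get(gene, 0.0), 0.0)
        for gene, observed in cell_counts.items()
    }


# Made-up counts: the ambient pool is dominated by HBB (e.g. from lysed red cells).
empties = [{"HBB": 5, "CD3E": 1}, {"HBB": 3, "LYZ": 1}]
profile = ambient_profile(empties)

t_cell = {"HBB": 10, "CD3E": 80, "LYZ": 10}
corrected = subtract_ambient(t_cell, profile, contamination=0.10)
print({gene: round(value, 2) for gene, value in corrected.items()})
# {'HBB': 2.0, 'CD3E': 79.0, 'LYZ': 9.0}
```

Most of the HBB signal in this toy T cell is attributed to the ambient pool, while the CD3E count barely changes. In practice the contamination fraction is not known in advance and has to be estimated from the data.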

cellranger is the toolkit developed by 10x Genomics to process the raw data (demultiplexing, alignment, and barcode/UMI counting).

License:MIT License