alexdobin / STAR

RNA-seq aligner

Calculation of sequencing saturation

nbartonicek opened this issue

Good day,
I am trying to figure out how sequencing saturation is calculated from the basic stats in Summary.csv, since the numbers do not add up.

In the following case:
Number of Reads,72704122
Reads With Valid Barcodes,0.983831
Sequencing Saturation,0.443522
Q30 Bases in CB+UMI,0.958173
Q30 Bases in RNA read,0.938477
Reads Mapped to Genome: Unique+Multiple,0.951636
Reads Mapped to Genome: Unique,0.754136
Reads Mapped to Gene: Unique+Multiple Gene,NoMulti
Reads Mapped to Gene: Unique Gene,0.715291
Estimated Number of Cells,76
Unique Reads in Cells Mapped to Gene,51754771
Fraction of Unique Reads in Cells,0.995196
Mean Reads per Cell,680983
Median Reads per Cell,573753
UMIs in Cells,28775002
Mean UMI per Cell,378618
Median UMI per Cell,319694
Mean Gene per Cell,10198
Median Gene per Cell,10303
Total Gene Detected,19735
...
I expected sequencing saturation to be: 1 - (unique reads in cells mapped to gene)/(reads with valid barcodes).
But that gives only 1 - 51754771/(72704122*0.983831) = 0.276, not the reported 0.443522.
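
The arithmetic above can be reproduced with a short Python sketch. The numbers are copied from the Summary.csv excerpt; the variable names are only illustrative and are not STAR identifiers.

```python
# Values copied from the Summary.csv excerpt in this post.
number_of_reads = 72_704_122
frac_valid_barcodes = 0.983831            # "Reads With Valid Barcodes"
unique_reads_in_cells_mapped_to_gene = 51_754_771
reported_saturation = 0.443522            # "Sequencing Saturation"

# Formula as assumed in this post (not necessarily what STAR computes).
reads_with_valid_barcodes = number_of_reads * frac_valid_barcodes
expected_saturation = 1 - unique_reads_in_cells_mapped_to_gene / reads_with_valid_barcodes

print(f"expected: {expected_saturation:.3f}")   # expected: 0.276
print(f"reported: {reported_saturation:.3f}")   # reported: 0.444
```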

Any help greatly appreciated,
Nenad

Hi Nenad,

The formula for calculating saturation is explained here: #2048
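
As a rough cross-check, here is a minimal Python sketch under a loudly stated assumption: that saturation is defined as 1 - (unique UMIs)/(reads counted toward genes), which is the commonly used definition. This thread itself only points to #2048 and does not state the formula, so treat this as a guess rather than the confirmed method. The sketch also uses the cell-only counts from Summary.csv as a proxy, so an exact match with the reported value should not be expected.

```python
# Values copied from the Summary.csv excerpt in the question.
umis_in_cells = 28_775_002                        # "UMIs in Cells"
unique_reads_in_cells_mapped_to_gene = 51_754_771
reported_saturation = 0.443522                    # "Sequencing Saturation"

# ASSUMPTION: saturation = 1 - unique UMIs / reads mapped to genes.
# Cell-only counts are used here as a stand-in for the totals over all
# barcodes, which are not listed in the excerpt.
approx_saturation = 1 - umis_in_cells / unique_reads_in_cells_mapped_to_gene

print(f"approximate: {approx_saturation:.3f}")    # approximate: 0.444
print(f"difference:  {abs(approx_saturation - reported_saturation):.4f}")
```

Under this assumption the result lands within about 0.001 of the reported value, much closer than the 0.276 obtained in the question. The small remaining gap would be consistent with the reported figure being computed over all barcodes rather than only those called as cells, but that too is an assumption.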