Normalization of RNA-seq gene expression data. Supported methods are Transcripts per kilobase million (TPM) and Counts per million (CPM).
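As a rough illustration of what the two methods compute, here is a minimal pure-Python sketch of the standard CPM and TPM definitions. It is not taken from the rnanorm source: the function names `cpm` and `tpm`, the dictionary inputs, and the toy gene names are assumptions made for this example, and it handles a single sample with non-zero total counts only.

```python
def cpm(counts):
    """Counts per million: scale raw counts so that they sum to 1e6."""
    total = sum(counts.values())
    return {gene: count / total * 1e6 for gene, count in counts.items()}


def tpm(counts, lengths):
    """Transcripts per kilobase million.

    First divide each count by the gene length in kilobases (reads per
    kilobase), then scale the results so that they sum to 1e6.
    """
    rpk = {gene: count / (lengths[gene] / 1000) for gene, count in counts.items()}
    scale = sum(rpk.values())
    return {gene: value / scale * 1e6 for gene, value in rpk.items()}


# Hypothetical toy data: raw counts and gene lengths in base pairs.
counts = {"geneA": 10, "geneB": 30}
lengths = {"geneA": 1000, "geneB": 3000}

cpm_values = cpm(counts)
tpm_values = tpm(counts, lengths)
```

In this toy case geneB has three times the counts of geneA but is also three times longer, so the two genes get different CPM values and equal TPM values.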
TPM normalization can either accept pre-computed gene lengths as input or compute gene lengths from a gene annotation in GTF format, using the union exon-based approach. The computed gene lengths are identical to the lengths reported by featureCounts (validated for Homo sapiens, Mus musculus, Rattus norvegicus, and Macaca mulatta with ENSEMBL and UCSC annotations).
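The union exon-based approach can be pictured as merging all exons of a gene into non-overlapping intervals and summing their lengths. The sketch below shows this idea only and is not the rnanorm implementation: the function name `union_exon_length` and the example coordinates are hypothetical, and it assumes exons are given as 1-based inclusive (start, end) pairs, as in GTF.

```python
def union_exon_length(exons):
    """Total number of bases covered by at least one exon.

    `exons` is an iterable of (start, end) pairs with 1-based inclusive
    coordinates (the GTF convention). Overlapping exons are merged so
    that shared bases are counted once.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(exons):
        if current_end is None or start > current_end:
            # No overlap with the current interval: close it and start a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlap: extend the current interval if this exon reaches further.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


# Hypothetical gene with two overlapping exons and one separate exon.
gene_length = union_exon_length([(100, 200), (150, 300), (400, 450)])
```

Here the first two exons merge into 100-300 (201 bases) and the third contributes 51 bases, giving a gene length of 252.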
Install the rnanorm Python package:
pip install rnanorm
See the rnanorm command help:
rnanorm --help
Run rnanorm with pre-computed gene lengths:
rnanorm expr.tsv --cpm-output=expr.cpm.tsv --tpm-output=expr.tpm.tsv --gene-lengths=lengths.tsv
Run rnanorm with a genome annotation (gene lengths will be computed on the fly):
rnanorm expr.tsv --cpm-output=expr.cpm.tsv --tpm-output=expr.tpm.tsv --annotation=annot.gtf
Install the rnanorm Python package for development:
flit install --deps=all --symlink
Run all tests and linters:
tox