arogozhnikov / rtg_score

Rank-To-Group score evaluates contribution of confounding factors

Rank-To-Group (RTG) score evaluates contribution of confounders

Batch, cell line, donor, plate, reprogramming, protocol: these and other confounding factors influence cell cultures in vitro.

The RTG score tracks the contribution of different factors to variability by estimating how Rank maps To Group. Scoring relies on ranking samples by similarity, so it makes no explicit or implicit assumption of linearity.
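The ranking idea described above can be illustrated with a small, self-contained sketch. It is not taken from the package and may differ from how rtg_score computes its score: the function name `rank_to_group_auc`, the use of Euclidean distance, and the AUC-style averaging are all assumptions made for illustration. For every sample, the other samples are ranked by distance, and the score measures how often same-group samples rank ahead of other-group samples.

```python
import math

def rank_to_group_auc(points, groups):
    """Hypothetical illustration, not the rtg_score implementation.

    For each sample, rank all other samples by Euclidean distance and
    compute the fraction of (same-group, other-group) pairs in which the
    same-group sample is closer. Ties count as half. The per-sample
    values are averaged.
    """
    per_sample = []
    for i, (p, g) in enumerate(zip(points, groups)):
        # Distance from sample i to every other sample, with a same-group flag.
        others = [(math.dist(p, q), groups[j] == g)
                  for j, q in enumerate(points) if j != i]
        pos = [d for d, same in others if same]
        neg = [d for d, same in others if not same]
        if not pos or not neg:
            continue  # the value is undefined without both kinds of neighbours
        wins = sum((dp < dn) + 0.5 * (dp == dn) for dp in pos for dn in neg)
        per_sample.append(wins / (len(pos) * len(neg)))
    return sum(per_sample) / len(per_sample)

# Two tight clusters; grouping by cluster is perfectly recovered by ranking.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
by_cluster = rank_to_group_auc(points, ["a", "a", "b", "b"])
# A grouping unrelated to the clusters scores much lower.
unrelated = rank_to_group_auc(points, ["a", "b", "a", "b"])
print(by_cluster, unrelated)
```

Because only the ordering of distances matters, any monotonic rescaling of the distances leaves such a score unchanged, which is the sense in which a rank-based score avoids linearity assumptions.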

RTG works well with both readily interpretable data (gene expression, cell types) and embeddings produced by deep learning.

Usage

rtg_score is a Python package. Installation:

pip install rtg_score

The RTG score requires two DataFrames: one with confounders and one with embeddings (or other features, e.g. gene expression):

from rtg_score import compute_RTG_score
# following code corresponds to computing an element of the figure above
# (exclude same organoid_id and batch+donor)
score = compute_RTG_score(
    metadata=confounders_metadata,
    include_confounders=['batch', 'donor'],
    exclude_confounders=['organoid_id'],
    embeddings=qpcr_delta_ct, 
)
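To make the include/exclude arguments in the call above concrete, here is a minimal sketch of how sample pairs could be labelled. It follows the description given in this README (pairs sharing any excluded confounder are omitted; pairs sharing all included confounders count as the same group), but the helper `label_pair` and the toy rows are hypothetical and not part of the rtg_score API.

```python
def label_pair(a, b, include, exclude):
    """Hypothetical helper: classify a pair of metadata rows (dicts).

    'excluded' - the pair shares at least one excluded confounder
    'positive' - the pair shares every included confounder
    'negative' - anything else
    """
    if any(a[c] == b[c] for c in exclude):
        return "excluded"
    if all(a[c] == b[c] for c in include):
        return "positive"
    return "negative"

# Toy metadata rows; the values are made up for illustration.
s1 = {"batch": 1, "donor": "A", "organoid_id": "o1"}
s2 = {"batch": 1, "donor": "A", "organoid_id": "o1"}  # same organoid as s1
s3 = {"batch": 1, "donor": "A", "organoid_id": "o2"}  # same batch and donor
s4 = {"batch": 2, "donor": "A", "organoid_id": "o3"}  # different batch

include, exclude = ["batch", "donor"], ["organoid_id"]
labels = [label_pair(s1, other, include, exclude) for other in (s2, s3, s4)]
print(labels)
```

Under this reading, the score for batch+donor is estimated only from pairs that come from different organoids, so similarity within a single organoid cannot inflate it.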

Use compute_RTG_contribution_matrix to compute multiple RTG scores in bulk.
An example and plotting code are available in the example subfolder.

Parameters

  • metadata - DataFrame with confounding variables
    • sample_id, batch, donor, clone, plate, etc.
  • embeddings - numerical description of samples (DataFrame or 2d array).
    • Examples: qPCR delta Cts, deep learning embeddings, cell type fractions
  • metric - how similarity between samples is defined
    • use euclidean for qPCR and various embeddings, and hellinger for cell type distributions
  • include_confounders and exclude_confounders - which pairs of samples are compared:
    • including ['donor', 'batch'] and excluding ['clone', 'plate'] estimates how similar samples with the same donor and the same batch are, while omitting pairs that share a clone or were grown on the same plate
    • most use cases are simple, such as including batch while excluding plate, but the framework is very flexible
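The metric recommendation above can be motivated with a short sketch of the Hellinger distance. This is the standard textbook definition written in plain Python; whether the package's hellinger option uses the same normalisation is an assumption, and the function name `hellinger` below is only illustrative.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Both inputs are sequences of non-negative fractions that sum to 1,
    e.g. cell type fractions. The result lies between 0 and 1.
    """
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)

# The same absolute shift of 0.01, once in a rare cell type and once in a common one.
rare_shift = hellinger([0.99, 0.01], [0.98, 0.02])
common_shift = hellinger([0.50, 0.50], [0.49, 0.51])

# Euclidean distance treats both shifts identically.
euclid_rare = math.dist([0.99, 0.01], [0.98, 0.02])
euclid_common = math.dist([0.50, 0.50], [0.49, 0.51])

print(rare_shift > common_shift)
```

The square root makes a change from 1% to 2% count for more than a change from 50% to 51%, which is often the desired behaviour when comparing compositions that include rare cell types.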

Example analysis

The preprint demonstrates the application of the RTG score to multimodal analysis of cerebral organoids and shows which conclusions can be drawn.

See the example subfolder for the actual code.

License: MIT License