icbi-lab / infercnvpy

Intratumoral heterogeneity scores based on CNAs and gene expressions
The calculations of intratumoral heterogeneity scores were inspired by a previous study and modified as follows (ref. 35). First, to calculate ITHCNA, we used the relative expression value matrix generated by inferCNV and calculated the pairwise cell–cell distances using Pearson's correlation coefficients for each patient. ITHCNA was defined as the interquartile range (IQR) of the distribution of Pearson's correlation coefficients over all malignant cell pairs. Similarly, we also used gene expression profiles of cancer cells of each patient to construct the distribution of the intratumoral distances. ITHGEX was assigned as the IQR of the distribution. Public single-cell lung cancer datasets GSE131907 and E-MTAB-6149 were used to calculate the ITHGEX scores of early-stage and advanced-stage lung cancer.

https://www.nature.com/articles/s41467-021-22801-0#Sec11
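
To make the definition concrete, here is a minimal NumPy sketch of the score as I read the paragraph above: the IQR of the Pearson correlations over all cell pairs. The function name `ith_score` and the simulated data are made up for illustration, and this is not the paper's code. It assumes a dense cells x features matrix in which no cell has a constant profile (a constant row would produce NaN correlations).

```python
import numpy as np


def ith_score(matrix: np.ndarray) -> float:
    """IQR of pairwise Pearson correlations between rows (cells).

    `matrix` is cells x features, e.g. the relative expression values
    produced by inferCNV for the malignant cells of one patient.
    `ith_score` is a hypothetical name, not part of any library.
    """
    matrix = np.asarray(matrix, dtype=float)
    if matrix.shape[0] < 2:
        raise ValueError("need at least two cells to form a cell pair")
    # np.corrcoef treats each row as one variable, so this is cell x cell.
    corr = np.corrcoef(matrix)
    # Keep each unordered cell pair once and drop the diagonal.
    pairs = corr[np.triu_indices_from(corr, k=1)]
    q75, q25 = np.percentile(pairs, [75, 25])
    return float(q75 - q25)


# Simulated data, purely illustrative.
rng = np.random.default_rng(0)
profile = rng.normal(size=200)
other = rng.normal(size=200)
# Homogeneous tumor: every cell is the same profile plus small noise.
homogeneous = profile + 0.1 * rng.normal(size=(50, 200))
# Heterogeneous tumor: two subclones with unrelated profiles.
heterogeneous = np.vstack(
    [
        profile + 0.1 * rng.normal(size=(25, 200)),
        other + 0.1 * rng.normal(size=(25, 200)),
    ]
)
print(ith_score(homogeneous), ith_score(heterogeneous))
```

The two-subclone matrix should score clearly higher, because its pairwise correlations split into a within-clone group near 1 and a between-clone group near 0. The same function would give an ITHGEX-style score if fed gene expression profiles instead of the inferCNV matrix.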

@elisapanizzolo, would be great if you could add a PR with your implementation of ITHCNA to this library.

Here's an example of how a tool is implemented in infercnvpy:

infercnvpy/infercnvpy/tl/_infercnv.py

Lines 14 to 66 in cf40cb2

    
    def cnv_score(
        adata: AnnData,
        *,
        obs_key: str = "cnv_leiden",
        use_rep: str = "cnv",
        key_added: str = "cnv_score",
        inplace: bool = True,
    ):
        """Assign each cnv cluster a CNV score.

        Clusters with a high score are likely affected by copy number aberrations.
        Based on this score, cells can be divided into tumor/normal cells.

        The score is currently simply defined as the mean of the absolute values
        of the result of :func:`infercnvpy.tl.infercnv` for each cluster.

        Parameters
        ----------
        adata
            annotated data matrix
        obs_key
            Key under which the clustering is stored in adata.obs. Usually
            the result of :func:`infercnvpy.tl.leiden`, but could also be
            other clusters, e.g. obtained from transcriptomics data.
        use_rep
            Key under which the result of :func:`infercnvpy.tl.infercnv` is stored
            in adata.
        key_added
            Key under which the score will be stored in `adata.obs`.
        inplace
            If True, store the result in adata, otherwise return it.

        Returns
        -------
        Depending on the value of `inplace`, either returns `None` or a vector
        with scores.
        """
        if obs_key not in adata.obs.columns and obs_key == "cnv_leiden":
            raise ValueError(
                "`cnv_leiden` not found in `adata.obs`. Did you run `tl.leiden`?"
            )
        # One score per cluster: mean absolute CNV value over all cells and positions.
        cluster_score = {
            cluster: np.mean(
                np.abs(adata.obsm[f"X_{use_rep}"][adata.obs[obs_key] == cluster, :])
            )
            for cluster in adata.obs[obs_key].unique()
        }
        # Broadcast the per-cluster score back to one value per cell.
        score_array = np.array([cluster_score[c] for c in adata.obs[obs_key]])

        if inplace:
            adata.obs[key_added] = score_array
        else:
            return score_array
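
An ITHCNA tool could plausibly reuse the same "score per group, then broadcast to cells" pattern. The sketch below works on a plain NumPy matrix and a label vector rather than an `AnnData` object, so that it runs without the library. `ith_cna_per_group` is a hypothetical name, the dense-matrix input is an assumption, and this is not the implementation referred to in this thread.

```python
import numpy as np


def ith_cna_per_group(cnv: np.ndarray, groups) -> np.ndarray:
    """Hypothetical helper: one ITHCNA value per group, repeated per cell.

    `cnv` is a dense cells x positions matrix (what `adata.obsm["X_cnv"]`
    would hold after densifying); `groups` has one label per cell,
    e.g. a patient or cluster ID.
    """
    cnv = np.asarray(cnv, dtype=float)
    groups = np.asarray(groups)
    group_score = {}
    for group in np.unique(groups):
        cells = cnv[groups == group, :]
        if cells.shape[0] < 2:
            # A single cell has no cell pairs, so the score is undefined.
            group_score[group] = np.nan
            continue
        corr = np.corrcoef(cells)  # cell x cell Pearson correlations
        pairs = corr[np.triu_indices_from(corr, k=1)]  # each pair once
        q75, q25 = np.percentile(pairs, [75, 25])
        group_score[group] = q75 - q25
    # Same broadcast step as `score_array` in `cnv_score`.
    return np.array([group_score[g] for g in groups])


# Illustrative call on random data with two patients.
rng = np.random.default_rng(1)
scores = ith_cna_per_group(rng.normal(size=(20, 50)), ["p1"] * 10 + ["p2"] * 10)
print(scores.shape)
```

In a real tool, the `inplace` / `key_added` handling from `cnv_score` would wrap this array. Restricting the computation to malignant cells, as the paper does, is left to the caller here.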

We can discuss details after the holidays.

Cheers,
Gregor

Never mind, I already implemented it here based on @elisapanizzolo's code.

Will eventually add it to this repo.

Add a tool for ITHCNA