icbi-lab / infercnvpy

Infer copy number variation (CNV) from scRNA-seq data. Plays nicely with Scanpy.

Home Page:https://infercnvpy.readthedocs.io/en/latest/

Accessing Metadata from AnnData

shahrozeabbas opened this issue · comments

Hello,

I am attempting to export metadata after running inferCNV. I am able to export the CNV score and the Leiden clusters. However, I would also like to access everything that is available in the R version, such as the loss or gain of each chromosome for each cell. The R version seems to produce a large table (~200 columns) with data for each chromosome. Is it possible to access this somehow using the Python version?

Hi,

the matrix with CNV scores is stored in

> adata.obsm["X_cnv"]
<184x9111 sparse matrix of type '<class 'numpy.float64'>'
	with 214913 stored elements in Compressed Sparse Row format>

where each row is a cell and each column a genomic region.

Additionally, there's information on which columns of this matrix belong to which chromosome in

> adata.uns["cnv"]["chr_pos"]
{'chr1': 0,
 'chr2': 915,
 'chr3': 1574,
 'chr4': 2141,
 'chr5': 2454,
 'chr6': 2902,
 'chr7': 3394,
 'chr8': 3874,
 'chr9': 4195,
 'chr10': 4564,
 'chr11': 4955,
 'chr12': 5494,
 'chr13': 6009,
 'chr14': 6179,
 'chr15': 6499,
 'chr16': 6791,
 'chr17': 7209,
 'chr18': 7787,
 'chr19': 7928,
 'chr20': 8523,
 'chr21': 8781,
 'chr22': 8880}

i.e. in this example

adata.obsm["X_cnv"][:, 0:915]

contains the scores for chr1.
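
The same slicing can be repeated for every chromosome to build a cells-by-chromosomes table. The following is a minimal sketch, not part of infercnvpy: the helper name `per_chromosome_scores` is hypothetical, and the toy matrix and dictionary only stand in for `adata.obsm["X_cnv"]` and `adata.uns["cnv"]["chr_pos"]`. It assumes each chromosome's columns run from its start index up to the start of the next chromosome (or the end of the matrix).

```python
import numpy as np
import pandas as pd
from scipy import sparse


def per_chromosome_scores(x_cnv, chr_pos):
    """Average the CNV matrix over the columns of each chromosome (hypothetical helper)."""
    # Sort chromosomes by start column so consecutive starts delimit the ranges.
    ordered = sorted(chr_pos.items(), key=lambda kv: kv[1])
    starts = [start for _, start in ordered]
    stops = starts[1:] + [x_cnv.shape[1]]
    columns = {}
    for (chrom, start), stop in zip(ordered, stops):
        block = x_cnv[:, start:stop]
        # For a sparse matrix, .mean(axis=1) is (n_cells, 1); flatten it to 1-D.
        columns[chrom] = np.asarray(block.mean(axis=1)).ravel()
    return pd.DataFrame(columns)


# Toy stand-ins for adata.obsm["X_cnv"] and adata.uns["cnv"]["chr_pos"].
toy_cnv = sparse.csr_matrix(np.array([
    [1.0, 1.0, 0.0, 0.0, 2.0],
    [0.0, 2.0, 4.0, 0.0, 0.0],
]))
toy_chr_pos = {"chr1": 0, "chr2": 2}

table = per_chromosome_scores(toy_cnv, toy_chr_pos)
print(table)
```

With real data, the result could be indexed by cell with `table.index = adata.obs_names` and then joined to `adata.obs` or written out with `table.to_csv(...)`.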

hope that helps,
Gregor

Hello,

Yes, this is helpful, thanks! However, it looks like this info is a superset of the table described here. Is there a way to get the 'map_metadata_from_infercnv.txt' described in that link directly from the infercnv object? Failing that, is there a way to calculate these data from what's available in adata.obsm["X_cnv"]?

Thank you for your help,
Shahroze

Unfortunately, segmentation (e.g. using an HMM) is currently not implemented in infercnvpy (see also #1).
In principle, you can aggregate the CNV matrix over a region you are interested in. For example, indices 915:1200 refer (roughly) to the first half of chromosome 2. If you are interested in this region, you could do

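import numpy as np
cnv_mat = adata.obsm["X_cnv"]
# mean over the selected columns; flatten the (n_cells, 1) result to one value per cell
chr2_score = np.asarray(cnv_mat[:, 915:1200].mean(axis=1)).ravel()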

to get a score for each cell.
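
A per-cell score like this can be turned into a crude gain/loss label by thresholding. The sketch below is only an illustration and no substitute for HMM-based segmentation: the function `label_gain_loss` and the cutoff of 0.1 are assumptions, and it presumes the scores are centered at zero, with positive values suggesting gains and negative values suggesting losses.

```python
import numpy as np


def label_gain_loss(scores, cutoff=0.1):
    """Label each cell 'gain', 'loss', or 'neutral' (hypothetical helper, arbitrary cutoff)."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.full(scores.shape, "neutral", dtype=object)
    labels[scores > cutoff] = "gain"
    labels[scores < -cutoff] = "loss"
    return labels


# Made-up scores standing in for something like chr2_score.
example_scores = np.array([0.25, -0.02, -0.4])
example_labels = label_gain_loss(example_scores)
print(example_labels)
```

A sensible cutoff depends on the dataset and on the reference cells used, so it is worth checking the score distribution before fixing one.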

Hello~
I ran into some trouble. I want to find the deleted CNV region on chr8, i.e. I want to know which genes are deleted on chr8. But from 'X_cnv' I can only get column numbers, not the corresponding gene IDs. The 'chromosome' metadata also does not match 'X_cnv'. I ran infercnvpy excluding "X, Y, MT, nan", but the numbers are wrong: chr14 has 170 genes, but according to ['chr_pos'] chr14 has 180.

>>adata.var['chromosome'].value_counts()
chr1      525
chr2      347
chr17     308
chr19     307
chr11     301
chr12     287
chr3      281
chr6      277
chr5      240
chr7      237
chr16     207
chr10     192
chr4      188
chr9      170
chr14     170
chr8      169
chrX      152
chr15     147
chr20     135
chr22     128
chr13      90
chr18      68
chr21      49
chrMT      18
chrnan      3

chr1 0
chr2 525
chr3 872
chr4 1153
chr5 1341
chr6 1581
chr7 1858
chr8 2095
chr9 2264
chr10 2434
chr11 2626
chr12 2927
chr13 3214
chr14 3313
chr15 3483
chr16 3630
chr17 3837
chr18 4145
chr19 4244
chr20 4551
chr21 4686
chr22 4785

So could you add the gene IDs to ['X_cnv'], or is there another way to get the CNV region? Thanks so much. Waiting for your reply.

Hi @zhangpebbels,

yes, the metadata in var does not correspond to the data in X_cnv, as one is based on genes and the other on bins that may consist of multiple genes.

@redst4r has been working on a feature to retrieve genes for each bin in #58. But there are still some tests failing and I'm not entirely sure what the status of that PR is.
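
To illustrate why bins and genes do not line up one-to-one, here is a generic sliding-window sketch. It is not infercnvpy's actual binning code: the function `genes_per_bin`, the window size, and the step are all hypothetical, and it assumes the genes are already sorted by genomic position within a single chromosome.

```python
def genes_per_bin(gene_names, window_size, step):
    """List the genes covered by each sliding-window bin (illustrative only)."""
    bins = []
    last_start = max(len(gene_names) - window_size, 0)
    for start in range(0, last_start + 1, step):
        bins.append(list(gene_names[start:start + window_size]))
    return bins


# Made-up gene names, ordered by position on one chromosome.
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
bins = genes_per_bin(genes, window_size=4, step=2)
for i, members in enumerate(bins):
    print(i, members)
```

Under these assumptions six genes yield only two bins, and neighbouring bins share genes, so a column index in a binned matrix points to a group of genes rather than to one gene ID.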

Hi, @grst @redst4r

Thank you for sharing this great wrapper for infercnvpy. I've been trying to annotate the matching genes on the heatmap plot (e.g. at the bottom, for all or a subset of the matching gene symbols), but turning on show_gene_labels=True only shows the relevant segment. Is there any workaround I can try? Possibly @redst4r has already found a solution but forgot to update the repo? Any tips would be much appreciated :-)

best,
Jun