infercnvpy.tl.infercnv result matrix is of incorrect dimensionality

Question

sean-at-tessera opened this issue 5 months ago

Report

The matrix added to my AnnData object by infercnvpy in anndata.obsm['X_cnv'] has an unexpected number of columns compared to the original input matrix (anndata.X); these values were 6671 and 6965 respectively.

I assumed that the matrix would be reduced to the number of windows over which CNV values were calculated. I used windowsize=100 and step=1, so I would have guessed I would have seen 6866 columns.

Is the anndata.obsm['X_cnv'] representing something other than the CNV scores for the windows? Also, is there a way to annotate the columns in this matrix with the corresponding chromosome?

Version information

No response

Gregor Sturm · Answer 1 · Sun Feb 04 2024 04:20:02 GMT+0800 (China Standard Time)

Hi @sean-at-tessera,

thanks for your questions.

> The matrix added to my AnnData object by infercnvpy in anndata.obsm['X_cnv'] has an unexpected number of columns compared to the original input matrix (anndata.X); these values were 6671 and 6965 respectively.
>
> I assumed that the matrix would be reduced to the number of windows over which CNV values were calculated. I used windowsize=100 and step=1, so I would have guessed I would have seen 6866 columns.

The running mean is computed using np.convolve with mode="same". That means that, for each chromosome, the number of columns in the CNV matrix is identical to the number of genes on that chromosome, independent of the window size. (At the beginning and the end of the chromosome, the mean is computed on fewer genes than the window size.)
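To illustrate the point about mode="same", here is a minimal sketch on toy data (the array `genes` and the small `window_size` are made up for the example, and this is not infercnvpy's actual code). It only shows that "same" keeps one output value per input position, whereas "valid" would give the `n - window_size + 1` values the question expected.

```python
import numpy as np

# Toy stand-in for the expression values of 10 genes on one chromosome.
window_size = 4
genes = np.arange(10, dtype=float)

# Averaging kernel: each output value is a weighted sum over `window_size` inputs.
kernel = np.ones(window_size) / window_size

# mode="same": output has the same length as the input.
same = np.convolve(genes, kernel, mode="same")

# mode="valid": only positions where the window fully overlaps the data.
valid = np.convolve(genes, kernel, mode="valid")

print(len(genes), len(same), len(valid))  # 10 10 7
```

With "same", the positions near both ends are computed from windows that only partially overlap the data, which is why the column count does not shrink with the window size.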

The reason why you see the discrepancy is that X and Y chromosomes are excluded by default.
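As a rough sketch of that explanation, the expected number of CNV columns is the number of genes located on chromosomes that are not excluded. The gene list below is invented for illustration. In current infercnvpy releases the default appears to be controlled by an `exclude_chromosomes` argument of `infercnvpy.tl.infercnv`; treat that name as an assumption and check the documentation of your installed version.

```python
# Hypothetical per-gene chromosome annotation (one entry per column of adata.X).
gene_chromosomes = ["chr1", "chr1", "chr2", "chrX", "chr2", "chrY", "chr1"]

# Chromosomes assumed to be skipped by default, per the answer above.
excluded = {"chrX", "chrY"}

# Only genes on the remaining chromosomes contribute a column to the CNV matrix.
kept = [chrom for chrom in gene_chromosomes if chrom not in excluded]

print(len(gene_chromosomes), len(kept))  # 7 5
```

In the reported case, the same logic would account for the gap between 6965 input genes and 6671 CNV columns.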

Also, is there a way to annotate the columns in this matrix with the corresponding chromosome?

There's a chr_pos field in adata.uns['cnv'] that gives you the starting column index for each chromosome:

>>> adata.uns['cnv']['chr_pos']
{'chr1': 0,
 'chr2': 508,
 'chr3': 894,
 'chr4': 1192,
 'chr5': 1434,
 'chr6': 1711,
 'chr7': 1999,
 'chr8': 2275,
 'chr9': 2501,
 'chr10': 2715,
 'chr11': 2932,
 'chr12': 3248,
 'chr13': 3534,
 'chr14': 3661,
 'chr15': 3857,
 'chr16': 4054,
 'chr17': 4292,
 'chr18': 4575,
 'chr19': 4690,
 'chr20': 4970,
 'chr21': 5106,
 'chr22': 5185}
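One way to turn these start indices into a per-column chromosome label is sketched below. The helper `chromosome_labels` is hypothetical (not part of infercnvpy), and the small `chr_pos` dictionary is toy data standing in for `adata.uns['cnv']['chr_pos']`.

```python
def chromosome_labels(chr_pos, n_columns):
    """Expand {chromosome: start index} into one label per CNV column."""
    # Order chromosomes by where their block of columns starts.
    starts = sorted(chr_pos.items(), key=lambda item: item[1])
    labels = []
    for i, (chrom, start) in enumerate(starts):
        # A block ends where the next one starts, or at the last column.
        end = starts[i + 1][1] if i + 1 < len(starts) else n_columns
        labels.extend([chrom] * (end - start))
    return labels


# Toy example: 7 columns spread over three chromosomes.
toy_chr_pos = {"chr1": 0, "chr2": 3, "chr3": 5}
toy_labels = chromosome_labels(toy_chr_pos, n_columns=7)
print(toy_labels)
# ['chr1', 'chr1', 'chr1', 'chr2', 'chr2', 'chr3', 'chr3']
```

On real data, the call would presumably be `chromosome_labels(adata.uns['cnv']['chr_pos'], adata.obsm['X_cnv'].shape[1])`, giving one label for each column of the CNV matrix.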

Hope that helps!

Sean Corbett · Answer 2 · Tue Feb 06 2024 05:19:09 GMT+0800 (China Standard Time)

Thank you @grst for your clear and helpful answer!