Format of input matrix
sdwien opened this issue · comments
I am trying consensusMIBC for the first time, and it works fine with the example data, but not with my own data. My input matrix looks like this:
head(input)
ENSG MH01001 MH01002 MH01003 MH01004 MH01005 MH01006
1 ENSG00000223972 0.000000 0 0.000000 0.000000 0.0000000 0.0000000
2 ENSG00000227232 1.757685 0 2.744584 2.389094 0.0000000 0.7287099
3 ENSG00000278267 0.000000 0 2.090266 1.824959 0.7022326 0.0000000
4 ENSG00000243485 0.000000 0 0.000000 0.000000 0.0000000 0.0000000
5 ENSG00000284332 0.000000 0 0.000000 0.000000 0.0000000 0.0000000
6 ENSG00000237613 0.000000 0 0.000000 0.000000 0.0000000 0.0000000
My command was:
consensusresults <- getConsensusClass(input,gene_id="ensembl_gene_id")
And I am getting this error:
Error in getConsensusClass(input, gene_id = "ensembl_gene_id") :
Empty intersection between profiled genes and the genes used for consensus classification.
Make sure that gene names correspond to the type of identifiers specified by the gene_id argument
What format does the input need to have? Does the first column need a specific name (here ENSG)?
Many thanks for your suggestions.
Best, Sophia
I found the problem: I had been working with versioned ENSG IDs (such as ENSG00000223972.5), and after removing the version numbers I ended up with duplicate rows for some genes. Apparently that was the cause. After removing the duplicates so that each gene appears only once, it worked for me.
Best, Sophia
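For anyone hitting the same error, here is a minimal sketch in R of the cleanup described above. The toy values are made up for illustration. Keeping the first row per gene is just one possible way to resolve duplicates. Moving the IDs into the row names is an assumption about what getConsensusClass expects (gene identifiers as row names, samples as columns), so please check it against the package documentation.

```r
# Toy data standing in for the real matrix (values are made up for illustration).
input <- data.frame(
  ENSG = c("ENSG00000223972.5", "ENSG00000227232.5", "ENSG00000227232.6"),
  MH01001 = c(0, 1.76, 1.50),
  MH01002 = c(0, 0, 0.20),
  stringsAsFactors = FALSE
)

# Drop the version suffix (".5", ".6", ...) from each Ensembl gene ID.
input$ENSG <- sub("\\.[0-9]+$", "", input$ENSG)

# Keep only the first row for each gene ID.
# (Assumption: first-occurrence is acceptable; averaging duplicates is an alternative.)
input <- input[!duplicated(input$ENSG), ]

# Assumption: the classifier matches genes on row names, so move the IDs there
# and leave only the numeric sample columns in the data frame.
rownames(input) <- input$ENSG
input$ENSG <- NULL

# The call from the original post would then be:
# consensusresults <- getConsensusClass(input, gene_id = "ensembl_gene_id")
```

Note that R refuses duplicate row names in a data frame, so the de-duplication step has to come before the rownames assignment.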