NaN output
pietrisko opened this issue
Hello,
When I use TDM to normalize RNA-Seq counts with microarray data as the reference, I get a matrix of NaN, like this:
gene c1 c2 c3 c4 c5 c6 c7 c8
1: ISG15 NaN NaN NaN NaN NaN NaN NaN NaN
2: AGRN NaN NaN NaN NaN NaN NaN NaN NaN
3: SCNN1D NaN NaN NaN NaN NaN NaN NaN NaN
4: VWA1 NaN NaN NaN NaN NaN NaN NaN NaN
5: MMP23B NaN NaN NaN NaN NaN NaN NaN NaN
6: GABRD NaN NaN NaN NaN NaN NaN NaN NaN
7: PRKCZ NaN NaN NaN NaN NaN NaN NaN NaN
8: SKI NaN NaN NaN NaN NaN NaN NaN NaN
9: RER1 NaN NaN NaN NaN NaN NaN NaN NaN
10: PLCH2 NaN NaN NaN NaN NaN NaN NaN NaN
However, if I use TPM data, TDM works, even though the resulting distribution does not match the microarray distribution (and you recommend using raw counts anyway).
So where is the problem?
Counts:
ISG15 0 38 0 0 0 0 0 0
AGRN 0 0 0 0 0 0 0 0
SCNN1D 0 0 0 0 0 0 0 0
VWA1 0 0 0 0 0 0 0 0
MMP23B 0 0 0 0 0 0 0 0
GABRD 0 0 0 0 0 0 0 0
PRKCZ 0 0 0 0 0 0 0 0
SKI 0 0 0 319 0 0 0 2
RER1 0 0 0 0 0 94 0 0
PLCH2 0 0 0 0 0 0 0 0
TPM:
ISG15 0.00000 0.0000 0.000000 0.00000 0.00000 0.00000 0.0000 3.30840 0.0000 0.00000
AGRN 0.00000 0.0000 0.077243 0.00000 0.00000 0.00000 0.0000 0.20914 0.0000 0.08134
SCNN1D 0.00000 0.0000 0.000000 0.00000 0.14143 0.00000 1.1972 0.00000 0.0000 0.00000
VWA1 0.00000 0.0000 0.000000 0.00000 0.99639 0.00000 0.0000 0.00000 0.0000 0.00000
MMP23B 0.00000 0.0000 0.000000 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000 2.18430
GABRD 0.00000 0.0000 0.000000 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000 0.00000
PRKCZ 0.00000 0.0000 0.436160 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000 0.00000
SKI 0.17505 2.1421 0.771460 0.58207 2.68910 0.35389 1.7557 2.00680 1.5018 2.14800
RER1 2.23510 0.0000 3.912600 1.87300 2.35610 2.73420 4.3066 0.00000 2.1365 0.00000
PLCH2 0.00000 0.0000 0.000000 0.00000 0.00000 0.00000 0.0000 0.00000 0.0000 0.00000
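One thing that stands out in the count excerpt above is how sparse it is. The sketch below is only a sanity check on those ten rows, not a diagnosis: it measures the fraction of zeros and the per-sample quartiles, then shows how a purely hypothetical interquartile-range rescaling (`iqr_scale`, which is not TDM's code) would produce NaN when the quartiles coincide. Whether TDM hits anything like this internally is an assumption, and the full matrix may look quite different from this excerpt.

```python
import math
import statistics

# The ten rows of raw counts posted above (samples c1..c8).
COUNTS = {
    "ISG15":  [0, 38, 0, 0, 0, 0, 0, 0],
    "AGRN":   [0, 0, 0, 0, 0, 0, 0, 0],
    "SCNN1D": [0, 0, 0, 0, 0, 0, 0, 0],
    "VWA1":   [0, 0, 0, 0, 0, 0, 0, 0],
    "MMP23B": [0, 0, 0, 0, 0, 0, 0, 0],
    "GABRD":  [0, 0, 0, 0, 0, 0, 0, 0],
    "PRKCZ":  [0, 0, 0, 0, 0, 0, 0, 0],
    "SKI":    [0, 0, 0, 319, 0, 0, 0, 2],
    "RER1":   [0, 0, 0, 0, 0, 94, 0, 0],
    "PLCH2":  [0, 0, 0, 0, 0, 0, 0, 0],
}


def sample_columns(counts):
    """Transpose the gene -> values mapping into one list per sample."""
    return [list(col) for col in zip(*counts.values())]


def zero_fraction(counts):
    """Fraction of entries in the whole matrix that are exactly zero."""
    values = [v for row in counts.values() for v in row]
    return sum(v == 0 for v in values) / len(values)


def quartiles(values):
    """Return (Q1, Q3) of one sample."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def iqr_scale(values):
    """Hypothetical IQR-based rescaling, for illustration only (NOT TDM's code).

    When Q1 == Q3 the scale is 0/0, which is reported here as NaN.
    """
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return [(v - q1) / iqr if iqr else math.nan for v in values]


columns = sample_columns(COUNTS)
iqrs = [q3 - q1 for q1, q3 in map(quartiles, columns)]

print("fraction of zeros:", zero_fraction(COUNTS))
print("per-sample IQR:", iqrs)
print("c2 after hypothetical rescaling:", iqr_scale(columns[1]))
```

If the full count matrix shows the same pattern, filtering out genes with no reads before normalization might be worth trying, although that is a guess and not a confirmed fix.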
Thanks for letting us know, I'll get to the bottom of it.
Hi, I am having the same problem when I use count data from RNA-Seq. Has this been resolved? Thanks.
Hi. I am facing the same issue described above, using both raw read counts and log-transformed read counts. I get a data frame full of NaN. Has anyone looked into this?
Thanks!
Hi, I'm also facing the same issue. Has anyone figured out why this happens?
@jeffreyat have you looked into it?
Thanks!
I have looked at it, although I guess we went to email and I forgot to update here. Unfortunately, I have been unable to reproduce the problem. If someone can reproduce it with a small dataset that they can send, it would greatly help!
Would any of you be willing to share a subset of the data that produces the problem so that @jeffreyat can dig into it? Ideally also with package version information, etc.
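For anyone who wants to share a subset, here is a minimal sketch of one way to cut a small test file out of a larger tab-delimited matrix. The function name `write_subset` and the file names in the usage comment are hypothetical, and the sketch assumes the first line of the file is a header. After subsetting, it is worth confirming that the smaller file still triggers the NaN output. For version information, R's `sessionInfo()` lists the loaded packages and their versions.

```python
import csv
from itertools import islice


def write_subset(src_path, dst_path, n_rows=50, delimiter="\t"):
    """Copy the header plus the first n_rows data rows of a delimited file.

    Returns the number of data rows written (the header is not counted).
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")

        header = next(reader, None)
        if header is None:  # empty input file: nothing to copy
            return 0
        writer.writerow(header)

        written = 0
        for row in islice(reader, n_rows):
            writer.writerow(row)
            written += 1
    return written


# Example usage (hypothetical file names):
# write_subset("rnaseq_counts.tsv", "rnaseq_counts_subset.tsv", n_rows=100)
```

Taking the first rows keeps the sketch simple; if the problem only shows up with particular genes, selecting those rows explicitly would be the better choice.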