greenelab / TDM

R package for normalizing RNA-seq data to make them comparable to microarray data.

NaN output

pietrisko opened this issue

Hello,

When I use TDM to normalize RNA-seq counts with microarray data as the reference, I get a matrix of NaN values, like this:

      gene             c1            c2             c3             c4             c5             c6              c7             c8
 1:  ISG15            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 2:   AGRN            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 3: SCNN1D            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 4:   VWA1            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 5: MMP23B            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 6:  GABRD            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 7:  PRKCZ            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 8:    SKI            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
 9:   RER1            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN
10:  PLCH2            NaN            NaN            NaN            NaN            NaN            NaN             NaN            NaN

But if I use TPM data, TDM works, even though the resulting distribution does not match the microarray distribution (indeed, you suggest using raw counts).

So where is the problem?

Counts:

ISG15               0             38              0              0              0              0               0              0
AGRN                0              0              0              0              0              0               0              0
SCNN1D              0              0              0              0              0              0               0              0
VWA1                0              0              0              0              0              0               0              0
MMP23B              0              0              0              0              0              0               0              0
GABRD               0              0              0              0              0              0               0              0
PRKCZ               0              0              0              0              0              0               0              0
SKI                 0              0              0            319              0              0               0              2
RER1                0              0              0              0              0             94               0              0
PLCH2               0              0              0              0              0              0               0              0

TPM:

ISG15       0.00000       0.0000     0.000000      0.00000      0.00000      0.00000       0.0000      3.30840       0.0000      0.00000
AGRN        0.00000       0.0000     0.077243      0.00000      0.00000      0.00000       0.0000      0.20914       0.0000      0.08134
SCNN1D      0.00000       0.0000     0.000000      0.00000      0.14143      0.00000       1.1972      0.00000       0.0000      0.00000
VWA1        0.00000       0.0000     0.000000      0.00000      0.99639      0.00000       0.0000      0.00000       0.0000      0.00000
MMP23B      0.00000       0.0000     0.000000      0.00000      0.00000      0.00000       0.0000      0.00000       0.0000      2.18430
GABRD       0.00000       0.0000     0.000000      0.00000      0.00000      0.00000       0.0000      0.00000       0.0000      0.00000
PRKCZ       0.00000       0.0000     0.436160      0.00000      0.00000      0.00000       0.0000      0.00000       0.0000      0.00000
SKI         0.17505       2.1421     0.771460      0.58207      2.68910      0.35389       1.7557      2.00680       1.5018      2.14800
RER1        2.23510       0.0000     3.912600      1.87300      2.35610      2.73420       4.3066      0.00000       2.1365      0.00000
PLCH2       0.00000       0.0000     0.000000      0.00000      0.00000      0.00000       0.0000      0.00000       0.0000      0.00000
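The thread does not identify a cause, so the following is only a diagnostic sketch for narrowing the problem down before reporting it. It rebuilds the ten count rows quoted above and summarizes each sample using base R only, with no TDM calls. The helper name `summarize_samples` is hypothetical. The idea that samples dominated by zeros (all-zero columns, or columns whose quartiles coincide) might be involved is an unconfirmed assumption, not something established here.

```r
# Rebuild the ten count rows quoted in the issue (samples c1..c8).
genes <- c("ISG15", "AGRN", "SCNN1D", "VWA1", "MMP23B",
           "GABRD", "PRKCZ", "SKI", "RER1", "PLCH2")
counts <- matrix(0, nrow = length(genes), ncol = 8,
                 dimnames = list(genes, paste0("c", 1:8)))
counts["ISG15", "c2"] <- 38
counts["SKI",   "c4"] <- 319
counts["SKI",   "c8"] <- 2
counts["RER1",  "c6"] <- 94

# Hypothetical helper (not part of TDM): one row of diagnostics per sample.
summarize_samples <- function(m) {
  # 2 x ncol matrix holding the 25% and 75% quantiles of each column
  q <- apply(m, 2, quantile, probs = c(0.25, 0.75), na.rm = TRUE)
  data.frame(
    sample   = colnames(m),
    all_zero = colSums(m != 0, na.rm = TRUE) == 0,  # no non-zero count at all
    iqr      = q[2, ] - q[1, ],                     # interquartile range
    n_na     = colSums(is.na(m)),                   # missing values in the input
    row.names = NULL
  )
}

report <- summarize_samples(counts)
print(report)
```

On this ten-gene excerpt, four of the eight samples are entirely zero and every sample has an interquartile range of 0. The excerpt is far too small to be representative, so the same summary should be run on the full count matrix before drawing any conclusion.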

Thanks for letting us know, I'll get to the bottom of it.

Hi, I am also having the same problem when I use count data from RNA-seq. Has this been resolved? Thanks.

Hi. I am facing the same issue described above, using both raw read counts and log-transformed read counts. I get a data frame full of NaN. Has anyone looked into this?
Thanks!

Hi, I'm also facing the same issue. Has anyone figured out why?

@jeffreyat have you looked into it?

Thanks!

I have looked at it, although I guess we went to email and I forgot to update here. Unfortunately, I have been unable to reproduce the problem. If someone can reproduce the problem with a small dataset that they can send, it would greatly help!

Would any of you be willing to share a subset of the data that produce the problem so that @jeffreyat can dig into it? Ideally also with package version information, etc.
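One way to prepare such a report is sketched below, assuming the counts are held in a matrix with genes in rows and samples in columns. The `counts` object here is simulated stand-in data, and the file name `tdm_nan_subset.rds` is hypothetical. Only base R functions are used, plus a guarded lookup of the installed TDM version.

```r
set.seed(42)

# Stand-in data: replace `counts` with the real object that produced NaN.
counts <- matrix(rpois(1000 * 8, lambda = 0.2), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("c", 1:8)))

# Keep a random subset of at most 200 genes, preserving the original row order.
keep  <- sort(sample(nrow(counts), min(200, nrow(counts))))
repro <- counts[keep, , drop = FALSE]

# Save the subset in a single file that can be attached or emailed.
out_file <- file.path(tempdir(), "tdm_nan_subset.rds")
saveRDS(repro, out_file)
cat("Subset written to:", out_file, "\n")

# Collect the version details requested in the thread.
print(sessionInfo())
tdm_version <- if (requireNamespace("TDM", quietly = TRUE)) {
  as.character(packageVersion("TDM"))
} else {
  "TDM not installed"
}
cat("TDM version:", tdm_version, "\n")
```

Before sending the file, rerun the normalization on `repro` to confirm that the subset still yields NaN. If it does not, a larger or different subset is needed.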