DML: sample names not used, order in beta and meta must match
abenjak opened this issue · comments
sesame 1.16.1
Hello,
The DML function requires that the order of rows in the sample information data frame (meta) matches the order of columns in the beta value matrix. Colnames and rownames do not matter!
That's fine, but this requirement is not mentioned in the description of this function. I naively assumed that it is enough that samples in betas be present anywhere in meta, so I ended up comparing the wrong groups of samples, and realised the problem only in downstream analyses.
I suggest clarifying this in the DML description so that other users do not make the same mistake as I did.
A more sophisticated feature of DML could be to sanity-check that colnames(beta)==rownames(meta), or similar.
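Until something like this exists, the alignment can be done on the user side before calling DML. A minimal sketch, assuming meta carries the sample IDs as row names and that they use the same IDs as the column names of the beta matrix; the toy values, probe IDs, and sample names (S1 to S3) are made up for illustration:

```r
# Toy beta matrix: 3 probes x 3 samples (values are made up)
betas <- matrix(
  c(0.1, 0.8, 0.5,
    0.2, 0.7, 0.4,
    0.9, 0.3, 0.6),
  nrow = 3,
  dimnames = list(c("cg01", "cg02", "cg03"), c("S1", "S2", "S3"))
)

# Sample sheet deliberately listed in a different order than the columns of betas
meta <- data.frame(
  group = c("tumor", "normal", "tumor"),
  row.names = c("S3", "S1", "S2"),
  stringsAsFactors = FALSE
)

# Every sample in betas must have a row in meta
stopifnot(all(colnames(betas) %in% rownames(meta)))

# Reorder meta so that row i describes column i of betas
meta <- meta[colnames(betas), , drop = FALSE]

# Order now matches; betas and meta can be passed to DML together
stopifnot(identical(rownames(meta), colnames(betas)))
```

Indexing meta by colnames(betas) both reorders the rows and drops samples that are not in the beta matrix, which covers the case where meta describes more samples than were analysed.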
Cheers,
Andrej
Thanks for the suggestion, @abenjak. I will add this sanity check and the description in the next version.
Actually, the sanity check may be tricky, since people may not be using the same column and row names (or row names at all, since tibble discourages them). I added the following instead: stopifnot(nrow(meta) == ncol(betas))
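This check catches a mismatch in the number of samples, but not in their order. For a meta table without row names, the order can still be verified on the user side. A rough sketch, assuming the sample IDs live in a column that is here hypothetically called sample_id (the data are toy values, not from the issue):

```r
# Toy beta matrix: 2 probes x 3 samples (values are made up)
betas <- matrix(
  c(0.1, 0.9,
    0.2, 0.8,
    0.3, 0.7),
  nrow = 2,
  dimnames = list(c("cg01", "cg02"), c("S1", "S2", "S3"))
)

# meta without row names; 'sample_id' is a hypothetical column name
meta <- data.frame(
  sample_id = c("S2", "S3", "S1"),
  group = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# The dimension check passes even though the samples are in the wrong order
stopifnot(nrow(meta) == ncol(betas))

# Find, for each column of betas, the matching row of meta
idx <- match(colnames(betas), meta$sample_id)
stopifnot(!anyNA(idx))  # fails if a sample in betas is missing from meta

# Reorder meta to follow the columns of betas
meta <- meta[idx, , drop = FALSE]
stopifnot(identical(meta$sample_id, colnames(betas)))
```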
Thanks @zwdzwd for your prompt reply!