zwdzwd / sesame

🍪 SEnsible Step-wise Analysis of DNA MEthylation BeadChips

DML: sample names not used, order in beta and meta must match

abenjak opened this issue

sesame 1.16.1
Hello,

The DML function requires that the order of rows in the sample information data frame (meta) matches the order of columns in the beta value matrix. Column names and row names are not used for matching!

That's fine, but this requirement is not mentioned in the description of this function. I naively assumed that it was enough for the samples in betas to be present anywhere in meta, so I ended up comparing the wrong groups of samples and realised the problem only in downstream analyses.
I suggest clarifying this in the DML description so that other users avoid making the same mistake.
A more sophisticated feature of DML could be to sanity-check that colnames(beta)==rownames(meta), or similar.

Cheers,
Andrej

Thanks for the suggestion, @abenjak. I will add this sanity check and the description in the next version.

Actually, the sanity check may be tricky since people may not be using the same column and row names (or row names at all, since tibble discourages them). I added the following instead: stopifnot(nrow(meta) == ncol(betas))

Thanks @zwdzwd for your prompt reply!
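
For readers who run into the same problem, here is a minimal user-side sketch of aligning meta to the columns of betas by sample name before calling DML. It is not part of sesame. The toy data, the probe and sample names, and the `sample` column in meta are all hypothetical, and the DML call in the last comment is only an assumed call form that is not executed here. Note that the dimension check on its own passes even when the order is wrong, so a name-based check is worth doing whenever sample names are available.

```r
# Toy beta matrix: 2 probes x 4 samples (all names are hypothetical)
betas <- matrix(
    c(0.1, 0.2, 0.9, 0.8,
      0.3, 0.4, 0.7, 0.6),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("cg01", "cg02"), c("S1", "S2", "S3", "S4")))

# Sample sheet deliberately listed in a different order than colnames(betas)
meta <- data.frame(
    sample = c("S3", "S1", "S4", "S2"),
    group  = c("tumor", "normal", "tumor", "normal"),
    stringsAsFactors = FALSE)

# The dimension-only check passes although rows and columns are misaligned
stopifnot(nrow(meta) == ncol(betas))

# Reorder meta so that row i describes column i of betas
idx <- match(colnames(betas), meta$sample)
stopifnot(!anyNA(idx))  # every column of betas must have a row in meta
meta_aligned <- meta[idx, , drop = FALSE]

# Explicit name-based check before any modelling
stopifnot(identical(meta_aligned$sample, colnames(betas)))

# meta_aligned is now in column order and could be passed on,
# e.g. DML(betas, ~group, meta = meta_aligned)  (assumed call, not run here)
```

Keeping the sample identifier in an ordinary column rather than in row names means the same approach also works with tibbles.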