rcastelo / GSVA

Gene set variation analysis

Implement a missing data policy for ssGSEA

rcastelo opened this issue

The current implementation of ssGSEA propagates missing (NA) values to the result. For proteomics data, as illustrated in #168, it can be useful to have a missing data policy that allows calculations to proceed without propagating NA values. Here we will do a first implementation of such a policy, modeled on the use argument of the base R cor() function. It will be exposed to the user through a new parameter called use in the ssGSEA parameter constructor function ssgseaParam(), which takes one of the following character strings as its value:

  • "everything" (default): NAs will propagate so that the resulting values will be NA whenever one or more of the input expression values are NA, giving a warning when that happens. This is the current behavior of the ssGSEA method in GSVA.
  • "all.obs": the presence of NAs in the input expression values will produce an error.
  • "na.rm": NAs in the input expression values will be removed from calculations, giving a warning when that happens, and giving an error if no values are left after removing NAs.
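
To make the three options concrete, here is a minimal sketch in plain R of how such a policy could behave on a single numeric vector. It is not the GSVA implementation: the helper name scoreWithNaPolicy() is hypothetical, and the mean is used only as a stand-in for the actual ssGSEA enrichment score. The warning and error messages are likewise made up for illustration.

```r
# Hypothetical helper, NOT part of GSVA: applies the proposed 'use' policy
# to a numeric vector and returns its mean as a stand-in for a real score.
scoreWithNaPolicy <- function(x, use = c("everything", "all.obs", "na.rm")) {
  use <- match.arg(use)   # rejects any value outside the three options
  hasNA <- anyNA(x)

  # "all.obs": the presence of NAs is an error
  if (use == "all.obs" && hasNA)
    stop("input contains NA values")

  # "everything": NAs propagate to the result, with a warning
  if (use == "everything") {
    if (hasNA)
      warning("NA values in the input; the result will be NA")
    return(mean(x))       # mean() propagates NA by default
  }

  # "na.rm": drop NAs with a warning, fail if nothing is left
  if (use == "na.rm" && hasNA) {
    warning("NA values in the input were removed from the calculation")
    x <- x[!is.na(x)]
    if (length(x) == 0)
      stop("no values left after removing NAs")
  }

  mean(x)
}

x <- c(2.5, NA, 4.0, 1.5)
suppressWarnings(scoreWithNaPolicy(x, use = "everything"))  # NA
suppressWarnings(scoreWithNaPolicy(x, use = "na.rm"))       # mean of the 3 non-NA values
```

Under this sketch, calling scoreWithNaPolicy(x, use = "all.obs") on the same vector stops with an error, which mirrors how cor() behaves with use = "all.obs". Using match.arg() makes "everything" the default because it is listed first, matching the default proposed above.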