minSize and maxSize
ag1805x opened this issue
What is the effect of minSize and maxSize in fgsea()? The documentation mentions "All pathways below/above the threshold are excluded". If the maxSize is set to 500 and one of the pathways has more than 500 genes, should I get an enrichment score for that pathway? I was trying to understand the change it brings but observed that the results stay the same.
library(fgsea)
data(examplePathways)
data(exampleRanks)
set.seed(42)
examplePathways <- examplePathways[lengths(examplePathways) > 500]
fgseaRes_15_100 <- fgsea(pathways = examplePathways,
                         stats = exampleRanks,
                         minSize = 15,
                         maxSize = 100)
fgseaRes_15_500 <- fgsea(pathways = examplePathways,
                         stats = exampleRanks,
                         minSize = 15,
                         maxSize = 500)
fgseaRes_15_1000 <- fgsea(pathways = examplePathways,
                          stats = exampleRanks,
                          minSize = 15,
                          maxSize = 1000)
fgseaRes_15_Inf <- fgsea(pathways = examplePathways,
                         stats = exampleRanks,
                         minSize = 15,
                         maxSize = Inf)
Only in the case of maxSize = 100, I get an empty data table. When maxSize = 1000, one term with 628 genes is missing.
@ag1805x Hi
The minSize and maxSize arguments control the size of the pathways that will be used for analysis. Here, the size of a pathway is not simply the number of genes it contains, but the size of the intersection between the pathway's genes and names(exampleRanks). Genes that are listed in the pathway but absent from the ranked list do not count, so a pathway with more than 500 listed genes can still fall under maxSize = 500. You can verify that all values in the size column of the fgsea results lie between minSize and maxSize.
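To make the distinction concrete, here is a minimal base R sketch of the filtering described above. It is an illustration on toy data and not fgsea's actual code: the gene names, the pathway names, and the variables raw_size, effective_size, min_size and max_size are all hypothetical, and the sketch assumes both thresholds are inclusive.

```r
# Toy ranked statistics: a named numeric vector, shaped like the stats argument.
stats <- c(geneA = 2.1, geneB = 1.4, geneC = 0.3, geneD = -0.8, geneE = -1.9)

# Toy pathways. pathway_big lists six genes, but only two of them are ranked.
pathways <- list(
  pathway_small = c("geneA", "geneB"),
  pathway_mid   = c("geneA", "geneB", "geneC", "geneD"),
  pathway_big   = c("geneA", "geneE", "geneX1", "geneX2", "geneX3", "geneX4")
)

# Raw size: how many genes each pathway lists.
raw_size <- lengths(pathways)

# Effective size: how many of those genes also appear in names(stats).
effective_size <- vapply(
  pathways,
  function(genes) length(intersect(genes, names(stats))),
  integer(1)
)

# Keep pathways whose effective size lies within the thresholds
# (inclusive bounds are an assumption of this sketch).
min_size <- 2
max_size <- 3
kept <- names(effective_size)[effective_size >= min_size &
                              effective_size <= max_size]

print(data.frame(raw_size, effective_size, kept = names(pathways) %in% kept))
```

In this toy case pathway_big is kept even though it lists six genes, because only two of them are ranked, while pathway_mid is dropped because all four of its genes are ranked. On the data from the question, the same comparison could be made with lengths(lapply(examplePathways, intersect, names(exampleRanks))) and checked against the size column of the fgsea results.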