Replace current outlier test by the Bonferroni Outlier Test

Question

schnorr opened this issue 6 years ago

As of today, outlier detection is nonexistent for sparse linear algebra (qrmumps) and based on the interquartile range (IQR) for dense linear algebra (cholesky). IQR is weak because it uses no performance model to anticipate expected behavior. We do have fair performance models for both cholesky and qrmumps, which lets us use the Bonferroni Outlier Test (available in the car package through the outlierTest function). Here's a code snippet that calls outlierTest on a model and classifies tasks as outliers:

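 library(car)     # outlierTest
 library(dplyr)   # filter, mutate, select, %>%
 library(tibble)  # tibble

 # Bonferroni p-values, named after the index of each reported observation
 out <- outlierTest(fit, n.max=Inf)
 out.tibble <- tibble(Order = as.integer(names(out$bonf.p)),
                      Bonferroni = out$bonf.p) %>%
    filter(Bonferroni < 0.5)
 # Recreate the same index on the original observations and flag the matches
 df %>%
    mutate(Order = 1:n()) %>%
    mutate(Outlier = Order %in% out.tibble$Order) %>%
    select(-Order)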

Here fit contains the model. Note that the order of the observations given to the model matters, since outlierTest reports outliers by their index. We therefore need to recreate that ordering on the original df observations and match it against the set of observations flagged by the Bonferroni test. The scalability of this approach is yet to be evaluated.
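
To make the index alignment concrete, here is a minimal end-to-end sketch on synthetic data. Everything specific in it is an assumption for illustration only: the column names GFlop and Duration, the plain lm() standing in for the real cholesky/qrmumps performance models, the two injected anomalies, and the 0.05 threshold (the snippet above filters at 0.5).

```r
library(car)     # outlierTest
library(dplyr)   # mutate, row_number, %>%
library(tibble)  # tibble

set.seed(42)

# Hypothetical task trace: duration grows linearly with a size-like regressor.
df <- tibble(GFlop = runif(200, 1, 10)) %>%
  mutate(Duration = 2 * GFlop + rnorm(n(), sd = 0.1))

# Inject two anomalous tasks at known row positions.
df$Duration[c(17, 120)] <- df$Duration[c(17, 120)] + 5

# Stand-in performance model (assumption: a simple linear fit).
# df is passed unfiltered and unsorted, so row i of df is observation i of fit.
fit <- lm(Duration ~ GFlop, data = df)

# Bonferroni p-values are named after the row index of each observation.
out <- outlierTest(fit, n.max = Inf)
keep <- !is.na(out$bonf.p) & out$bonf.p < 0.05
flagged <- as.integer(names(out$bonf.p)[keep])

# Flag the original observations by position, without a temporary Order column.
result <- df %>%
  mutate(Outlier = row_number() %in% flagged)
```

In this setup the two injected rows should be the ones flagged. The alignment only holds as long as df is neither reordered nor filtered between the model fit and the final mutate, which is the ordering caveat described above.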