johnantonn / cash-for-unsupervised-ad

Systematic Evaluation of CASH Search Strategies for Unsupervised Anomaly Detection

Hypotheses and visualization

johnantonn opened this issue · comments

Decide which hypotheses to focus on, the experiments required for each, and the appropriate visualization of the results for each one.

Hypotheses:

  1. Guided search algorithms (SMAC, Hyperband, etc.) will have better performance than unguided search and other baselines such as random search or random proportional search.

    • The search algorithms can be executed on a number of datasets and cover a set of PyOD anomaly detection algorithms (CBLOF, KNN, LOF, IForest, COPOD).
  2. Stratified validation sets will result in better performance compared to biased validation sets.

    • For now, this hypothesis cannot be tested for Successive Halving and Hyperband, since the PredefinedSplit resampling strategy results in crashes/errors.
  3. Larger validation sets (larger amounts of labeled data) will result in better performance compared to smaller validation sets.

    • Larger validation sets can be obtained from versions of the same dataset that include a higher ratio of outliers.
    • The datasets provided here come in several versions characterized by their outlier percentage: different versions of the same dataset include the same negative examples but a different number of positive examples (outliers), depending on the required percentage. Two approaches could be followed:
      • A first, naive approach would simply take the various versions of the dataset and proceed without explicitly acting on them. That would lead to validation sets with incrementally more outliers.
      • Another, more sophisticated approach would be to take the version with the smallest percentage of outliers and incrementally build validation sets with higher percentages, leaving the training set as is. That way, no positive examples would be wasted by being included in the training set during the original split.
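To make hypothesis 1 concrete, here is a minimal sketch of the unguided random-search baseline over a joint (algorithm, hyperparameter) space. The search space and `evaluate` function are illustrative assumptions: in the actual experiments, `evaluate` would fit the chosen PyOD detector and return e.g. its validation-set AUC.

```python
import random

# Hypothetical CASH search space over the PyOD detectors mentioned above;
# the hyperparameter names and ranges are illustrative assumptions.
SEARCH_SPACE = {
    "KNN": {"n_neighbors": range(1, 51)},
    "LOF": {"n_neighbors": range(1, 51)},
    "IForest": {"n_estimators": range(50, 301)},
    "CBLOF": {"n_clusters": range(2, 21)},
}

def evaluate(algorithm, params, rng):
    """Placeholder objective: stands in for fitting the chosen PyOD
    detector and scoring it on a labeled validation set."""
    return rng.random()

def random_search(n_iter=100, seed=0):
    """Unguided baseline: sample (algorithm, configuration) pairs
    uniformly at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_algo, best_params, best_score = None, None, float("-inf")
    for _ in range(n_iter):
        algo = rng.choice(sorted(SEARCH_SPACE))
        params = {name: rng.choice(list(values))
                  for name, values in SEARCH_SPACE[algo].items()}
        score = evaluate(algo, params, rng)
        if score > best_score:
            best_algo, best_params, best_score = algo, params, score
    return best_algo, best_params, best_score

best_algo, best_params, best_score = random_search()
```

Guided strategies such as SMAC would replace the uniform sampling with a model-based proposal step, but would plug into the same evaluation loop.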
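For hypothesis 2, the stratified validation set could be constructed as below; this is a sketch using plain NumPy rather than the project's actual resampling code, with labels assumed to be 1 for outliers and 0 for normals.

```python
import numpy as np

def stratified_split(y, valid_fraction, rng):
    """Sample validation indices per class so the validation set keeps
    the outlier ratio of y. A 'biased' split would instead sample
    indices uniformly, which for rare outliers can easily yield a
    validation set with too few (or zero) positives."""
    valid_idx = []
    for label in (0, 1):
        idx = np.flatnonzero(y == label)
        n = max(1, int(round(valid_fraction * len(idx))))
        valid_idx.append(rng.choice(idx, size=n, replace=False))
    return np.concatenate(valid_idx)

rng = np.random.default_rng(0)
y = np.array([1] * 20 + [0] * 980)  # toy labels, 2% contamination
valid = stratified_split(y, valid_fraction=0.2, rng=rng)
```

The resulting validation set preserves the 2% outlier ratio of the full data, which a uniform (biased) sample of 200 indices would only match in expectation.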
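The second, more sophisticated approach for hypothesis 3 could be sketched as follows. The index arrays are hypothetical stand-ins for dataset rows: the normal examples stay fixed, and for each target percentage enough additional outliers are drawn from the pool, so the validation sets are nested and no positives are wasted.

```python
import numpy as np

def build_validation_sets(normal_idx, outlier_pool, percentages, rng):
    """Build nested validation sets with increasing outlier percentage.
    For a target percentage p and a fixed set of normals, the number of
    outliers needed is p * n_normals / (1 - p)."""
    sets = {}
    drawn = np.array([], dtype=int)
    for p in sorted(percentages):
        target = int(round(p * len(normal_idx) / (1 - p)))
        extra = target - len(drawn)
        remaining = np.setdiff1d(outlier_pool, drawn)
        drawn = np.concatenate(
            [drawn, rng.choice(remaining, size=extra, replace=False)])
        sets[p] = np.concatenate([normal_idx, drawn])
    return sets

rng = np.random.default_rng(0)
normals = np.arange(1000)           # fixed negative examples
outliers = np.arange(1000, 1300)    # pool of positive examples
sets = build_validation_sets(normals, outliers, [0.02, 0.05, 0.10], rng)
```

Each set reuses all outliers of the previous one, so results across percentages differ only in the added positives, not in a fresh resample.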

A critical parameter of the experiments will be the split percentages for train/validation/test.
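A minimal sketch of how those split percentages could be parameterized; the 60/20/20 default is an assumption for illustration, not a value decided in this issue.

```python
import numpy as np

def three_way_split(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle n example indices and cut them into train/valid/test
    according to the given fractions."""
    assert abs(sum(fractions) - 1.0) < 1e-9
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(fractions[0] * n)
    n_valid = int(fractions[1] * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

train, valid, test = three_way_split(1000)
```

Sweeping `fractions` (in particular the validation share) would directly exercise hypothesis 3 as well.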