yzhao062 / pyod

A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

Home Page: http://pyod.readthedocs.io

Quasi-Monte Carlo Discrepancy always predicts an outlier

Hellsice opened this issue

I've found that the QMCD model will always predict at least one outlier because its decision scores are normalized.
As a result, the model cannot give a correct result when the dataset contains no outliers.
Is this intentional? If so, why was it implemented this way?
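
To illustrate the behaviour described above, here is a minimal sketch that does not use PyOD's actual code. It assumes the scores are min-max normalized and then compared against a fixed cut-off; `min_max_normalize`, `raw_scores`, and `threshold` are hypothetical names and values chosen for the example. Standard library only, so it runs without extra dependencies.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical raw scores for a clean dataset: all values are nearly identical.
raw_scores = [0.500, 0.501, 0.499, 0.502, 0.500]
normalized = min_max_normalize(raw_scores)

# Hypothetical fixed cut-off on the normalized scale (an assumption, not PyOD's rule).
threshold = 0.9
labels = [1 if s > threshold else 0 for s in normalized]

# The largest raw score is always mapped to exactly 1.0, so at least one
# point exceeds any cut-off below 1, however small the raw spread was.
print(max(normalized), sum(labels))
```

Under these assumptions the tiny spread in `raw_scores` is stretched to the full [0, 1] range, which is why a point ends up flagged even though nothing in the data stands out.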

Hi @Hellsice, great question, and I see your concern. The decision scores are normalized because QMCD sometimes assigns the lower scores to the outlier class. Normalizing allows a simple test that flips the results when this happens. However, I have noticed that this simple test is not very robust. A better sanity check might be to look at the skewness of the score distribution and flip the scores if it is strongly skewed to the left. I will investigate this.
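
A rough sketch of what such a skewness check could look like, assuming scores already lie in [0, 1]. This is not the library's implementation: `skewness`, `flip_if_left_skewed`, and the cut-off of `-1.0` are hypothetical choices made for illustration only.

```python
from statistics import fmean, pstdev


def skewness(scores):
    """Population (Fisher-Pearson) skewness; 0.0 for constant input."""
    mu = fmean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return 0.0
    n = len(scores)
    return sum((s - mu) ** 3 for s in scores) / (n * sigma ** 3)


def flip_if_left_skewed(scores, skew_cutoff=-1.0):
    """Flip scores in [0, 1] when the distribution is strongly left-skewed.

    skew_cutoff is a hypothetical value, not one taken from PyOD.
    """
    if skewness(scores) < skew_cutoff:
        return [1.0 - s for s in scores]
    return list(scores)


# Hypothetical inverted scores: the single outlier (index 5) has the LOWEST score.
inverted = [0.90, 0.95, 1.00, 0.92, 0.97, 0.00, 0.93, 0.96, 0.94, 0.98]
flipped = flip_if_left_skewed(inverted)

# Hypothetical well-oriented scores: the outlier already has the highest score.
oriented = [0.10, 0.05, 0.00, 0.08, 0.03, 1.00, 0.07, 0.04, 0.06, 0.02]
unchanged = flip_if_left_skewed(oriented)
```

A long tail toward low scores gives negative skewness, which is the case where the outliers sit at the bottom of the range, so flipping puts them back at the top. A right-skewed distribution is left untouched. The cut-off would need tuning on real data.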