yzhao062 / pyod

A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

Home Page: http://pyod.readthedocs.io


Which algorithms of PyOD are "robust"?

asmaier opened this issue · comments

In many cases, training on unlabeled data that already contains anomalies (outliers) can lead to learning a wrong detection model. For these cases, so-called robust algorithms have been developed, but I couldn't find documentation about which algorithms in PyOD are robust. For example, the PyOD implementation of PCA appears to use sklearn's PCA, which is not a robust PCA as described at https://en.wikipedia.org/wiki/Robust_principal_component_analysis or https://en.wikipedia.org/wiki/L1-norm_principal_component_analysis. So which algorithms in PyOD are really robust?
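For context, here is a minimal sketch of how the PCA detector in question is typically used (class and attribute names follow the PyOD docs; the synthetic contaminated data is purely illustrative). The point is that the model is fit directly on data that may already contain outliers, which is exactly the situation where a non-robust, least-squares PCA can be pulled off course:

```python
import numpy as np
from pyod.models.pca import PCA  # PyOD's PCA-based outlier detector

# Illustrative unlabeled training data: mostly inliers plus a few
# contaminating outliers, as described in the question above.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(500, 10))
X_train[:10] += 8.0  # shift a handful of rows far from the bulk

clf = PCA(contamination=0.02)  # expected fraction of outliers
clf.fit(X_train)               # fit on data that already contains outliers

scores = clf.decision_scores_  # outlier scores on the training data
labels = clf.labels_           # 0 = inlier, 1 = outlier
```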

Robustness is a relative term. I would recommend Isolation Forest as an ensemble method: good performance and relatively good robustness.
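For reference, a minimal sketch of the recommended Isolation Forest detector in PyOD (class and parameter names follow the PyOD docs; the data is illustrative only). Isolation Forest scores points by how quickly random splits isolate them, which is why a small fraction of outliers in the training data tends to distort it less than it would a least-squares method:

```python
import numpy as np
from pyod.models.iforest import IForest  # PyOD's Isolation Forest detector

# Illustrative contaminated training set: inliers plus a few outliers.
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(size=(490, 5)),          # inliers
    rng.normal(loc=6.0, size=(10, 5)),  # contaminating outliers
])

clf = IForest(n_estimators=200, contamination=0.02, random_state=0)
clf.fit(X_train)

outlier_scores = clf.decision_scores_  # higher score = more anomalous
predicted_labels = clf.labels_         # 0 = inlier, 1 = outlier
```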

I think there is a misunderstanding. Robustness in statistics is not a relative term. There is a whole field called robust statistics.

Robust statistics seek to provide methods that emulate popular statistical methods, but are not unduly affected by outliers or other small departures from model assumptions. (https://en.wikipedia.org/wiki/Robust_statistics)

But I agree the term robust can have different meanings for people not familiar with that field, so let me reformulate my question:

Which algorithms of PyOD are not unduly affected by outliers?