yzhao062 / pyod

A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

Home Page: http://pyod.readthedocs.io

DIF model: duplicate normalization

ValeZ1 opened this issue

In the DIF (Deep Isolation Forest) model, fit normalizes X with a MinMaxScaler and then passes the already-normalized X to decision_function to compute decision_scores_. decision_function applies the same scaler again, so the training data is normalized twice. As a result, decision_scores_ does not match the scores returned by calling decision_function(X) on the same X after fitting.

Normalization in fit:

pyod/pyod/models/dif.py

Lines 173 to 175 in 690a0f2

self.minmax_scaler = MinMaxScaler()
self.minmax_scaler.fit(X)
X = self.minmax_scaler.transform(X)

decision_function call in fit:

pyod/pyod/models/dif.py

Lines 215 to 216 in 690a0f2

self.decision_scores_ = self.decision_function(X)
self._process_decision_scores()

Normalization in decision_function:

X = self.minmax_scaler.transform(X)
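
To illustrate why the second transform matters, here is a minimal sketch in plain Python. It does not use pyod or scikit-learn: TinyMinMaxScaler is a hypothetical stand-in that assumes the default MinMaxScaler behaviour of scaling each feature to [0, 1], and the data is made up. It only shows that applying a fitted min-max scaler twice yields different inputs than applying it once, which is why the stored decision_scores_ and a later decision_function(X) can disagree.

```python
class TinyMinMaxScaler:
    """Hypothetical stand-in for a min-max scaler with range [0, 1]."""

    def fit(self, rows):
        # Record the per-feature minimum and span seen in the training data.
        cols = list(zip(*rows))
        self.mins = [min(c) for c in cols]
        # Fall back to 1.0 for constant features to avoid division by zero.
        self.spans = [(max(c) - min(c)) or 1.0 for c in cols]
        return self

    def transform(self, rows):
        # Apply (value - min) / span feature by feature.
        return [
            [(v - lo) / span for v, lo, span in zip(row, self.mins, self.spans)]
            for row in rows
        ]


# Made-up training data: three samples, two features.
X = [[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]]
scaler = TinyMinMaxScaler().fit(X)

# One transform: what decision_function(X) works with when given raw X.
once = scaler.transform(X)

# Two transforms: what happens when fit passes already-scaled X
# to decision_function, which scales it again.
twice = scaler.transform(once)

print("scaled once: ", once)
print("scaled twice:", twice)
print("same inputs? ", once == twice)
```

The once-scaled rows lie in [0, 1], while the twice-scaled rows are pushed outside that range, so any model scoring them sees different points. One possible way to avoid the mismatch, offered as an assumption and not as the maintainers' fix, would be for fit to keep the raw X and pass that to decision_function, so the scaler is applied exactly once on both paths.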