yzhao062 / pyod

A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

Home Page: http://pyod.readthedocs.io

different results depending on time-span

MKaroly001 opened this issue

Hello,
I am analyzing multivariate time series and want to find point anomalies (outliers) in them.
I am using 6 detectors (KNN, LOF, COPOD, PCA, OCSVM, INNE) together with SUOD.
I train the model on roughly 100k time steps (data points).
I then run inference with this model on previously unseen time series (a different time slice).
The result seems to depend on the time span that I run inference on: I scored 100k unseen data points and use this as the baseline, then scored ~200 data points and ~20 data points from the same time range.
I plotted the anomaly scores and they do not agree.
Even qualitatively the match is hard to understand: a data point with a high anomaly score in the 'short' inferred data can almost always be seen in the 'long' data as well, but not all of the higher anomaly scores can be seen in the 'short' data.
Is this expected behavior?
(In the attached file, the blue line is the relevant slice from the 'long' data and the red line is the 'short' data.)
[Attached image: week1]

Thanks for the answer.
Regards,
K.

Thank you for your reply.
I wrote a minimal example and it works as I expected (the result is independent of the time range).
I have to find the reason for the discrepancy in my own code.
Thank you for your quick response.

Regards,
K
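
A consistency check of the kind described in this thread might look roughly like the sketch below. It is not the poster's code, and the cause of the original discrepancy is not stated in the thread. `ToyKNNDetector` and `standardize` are hypothetical stand-ins written in plain Python so the example runs without PyOD; a fitted PyOD detector exposes a `decision_function` method that plays the same role. The sketch scores a long window and a short slice of it with the same fitted model. It then repeats the comparison after standardizing each window with its own statistics, which is one possible (assumed, not confirmed) way for scores to end up depending on the window length.

```python
import math
import random


class ToyKNNDetector:
    """Hypothetical stand-in for a fitted detector.

    The score of a point is its distance to the k-th nearest training point.
    """

    def __init__(self, k=3):
        self.k = k

    def fit(self, X):
        self.train_ = [tuple(row) for row in X]
        return self

    def decision_function(self, X):
        scores = []
        for row in X:
            dists = sorted(math.dist(row, t) for t in self.train_)
            scores.append(dists[self.k - 1])
        return scores


def standardize(X):
    """Z-score each column using the statistics of X itself (window-dependent)."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


rng = random.Random(0)
train = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
long_window = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
long_window[50] = [6.0, 6.0]          # injected point anomaly
short_window = long_window[40:60]     # 20 points taken from the long window

det = ToyKNNDetector(k=3).fit(train)

# Case 1: both windows are scored as-is by the same fitted model.
long_scores = det.decision_function(long_window)
short_scores = det.decision_function(short_window)
same_raw = all(
    math.isclose(a, b) for a, b in zip(long_scores[40:60], short_scores)
)

# Case 2: each window is standardized with its own mean and std before scoring.
long_scaled = det.decision_function(standardize(long_window))
short_scaled = det.decision_function(standardize(short_window))
same_scaled = all(
    math.isclose(a, b) for a, b in zip(long_scaled[40:60], short_scaled)
)

print("raw windows agree:", same_raw)
print("per-window standardized windows agree:", same_scaled)
```

In the first case every point is scored only against the training data, so the slice length cannot change its score. In the second case the preprocessing itself depends on which points are in the window, so the same time step receives different inputs and therefore different scores. If a pipeline contains a step like this, fitting the scaler once on the training data and reusing it at inference time would remove that dependence.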