different results depending on time-span

Question

MKaroly001 opened this issue 7 months ago · comments

Hello,
I am analyzing multivariate time series and want to find point anomalies (outliers) in them.
I am using 6 detectors (KNN, LOF, COPOD, PCA, OCSVM, INNE) together with SUOD.
I train the model on roughly 100k time steps (data points).
I then run inference with this model on previously unseen time series (a different time slice).
The result seems to depend on the time span I run inference on: I scored 100k unseen data points and use that as the baseline, then scored ~200 data points and ~20 data points from the same time range.
I plotted the anomaly scores and they do not agree.
The match does not make sense even qualitatively: a data point with a high anomaly score in the 'short' data can almost always be seen in the 'long' data as well, but not every high anomaly score in the 'long' data can be seen in the 'short' data.
Is this expected behavior?
(In the attached file, the blue line is the relevant slice of the 'long' data and the red line is the 'short' data.)

Thanks for the answer.
Regards,
K.

Andrew Maguire · Answer 1 · Fri Oct 27 2023 21:41:12 GMT+0800 (China Standard Time)

Can you make a small reproducible example? https://en.wikipedia.org/wiki/Minimal_reproducible_example

MKaroly001 · Answer 2 · Sat Oct 28 2023 23:30:21 GMT+0800 (China Standard Time)

Thank you for your reply.
I wrote a minimal example and it works as I expected (the result is independent of the time range).
I have to find the reason for the discrepancy in my own code.
Thank you for your quick response.

Regards,
K
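
For readers who want to run a similar check, here is a minimal sketch of the kind of test described in the reply above. It is not the code from this issue. It uses a hand-written k-nearest-neighbour distance score in plain Python as a stand-in for the detectors, and all names (`knn_score`, `score_batch`, `minmax`) and the synthetic data are hypothetical. The second half shows one possible way batch-dependent scores can arise (rescaling scores with statistics of the batch being scored); the thread does not say whether this was the actual cause.

```python
import math
import random


def knn_score(train, point, k=5):
    """Anomaly score of one point: distance to its k-th nearest training point."""
    dists = sorted(math.dist(point, t) for t in train)
    return dists[k - 1]


def score_batch(train, batch, k=5):
    """Score every point of a batch against the fixed training set."""
    return [knn_score(train, p, k) for p in batch]


def minmax(xs):
    """Rescale to [0, 1] using the min and max of the given batch only."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


random.seed(0)
# Hypothetical data: 3 channels, 500 training steps, 300 unseen steps.
train = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
unseen = [[random.gauss(0, 1) for _ in range(3)] for _ in range(300)]
unseen[0] = [100.0, 100.0, 100.0]  # one extreme outlier outside the short slice

long_scores = score_batch(train, unseen)            # the 'long' baseline
short_scores = score_batch(train, unseen[100:120])  # a 20-point 'short' slice

# Each point is scored on its own against the fixed training set,
# so the short slice reproduces the matching part of the long run.
print(short_scores == long_scores[100:120])

# Rescaling per batch breaks this: the outlier stretches the 'long' scale,
# so the same points get different normalized scores in the two runs.
print(minmax(short_scores) == minmax(long_scores)[100:120])
```

If the first comparison holds but the plotted scores still disagree, a reasonable next step is to look at anything in the surrounding pipeline that is computed per batch, such as scaling, windowing, or thresholding.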