0-distances error when using PECUZAL embedding

Question

xrisk opened this issue 3 years ago

Hi, we are trying to perform an embedding with some trace data we have from a simulation. I have attached the file consisting of the data points.

However, PECUZAL gives us:

Initializing PECUZAL algorithm for univariate input...
Starting 1-th embedding cycle...
Computed 0-distances. You might use model-data, thus try to add minimal additive noise to the signal you wish to embed and try again.

whereas optimal_traditional_de gives us:

Algorithm stopped due to convergence of E₁-statistic. Valid embedding achieved ✓.
Stochastic signal, valid embedding NOT achieved ⨉.

We are not sure what could be the reason for this. Is it that our trace is too noisy, or that our system exhibits essentially random behavior?

We are new to the analysis of nonlinear systems, therefore any links to relevant literature / material would be appreciated. Thank you!

points.txt

George Datseris · Answer 1 · Tue Apr 13 2021 07:41:17 GMT+0800 (China Standard Time)

There are duplicate data points in your data, which typically happens when sensor output is rounded. Try doing precisely what the error message says: add some small random noise to each data point. @hkraemer the error message is confusing with this "You might use model-data". Why don't we just say precisely what happens, "There are duplicate data points in your data", instead?
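A minimal sketch of that advice, assuming the trace is already loaded as a vector of floats: count how many values repeat an earlier one, then add noise far smaller than the resolution of the data. The helper names `count_duplicates` and `add_jitter`, the noise scale `1e-11`, and the toy trace are illustrative assumptions, not part of DelayEmbeddings.jl.

```julia
using Random

# Number of entries that repeat an earlier value (hypothetical helper).
count_duplicates(x) = length(x) - length(unique(x))

# Return a copy of `x` with tiny Gaussian noise added (hypothetical helper).
# `scale` should be far below the resolution of the data.
add_jitter(x; scale = 1e-11, rng = Random.default_rng()) =
    x .+ scale .* randn(rng, length(x))

# Toy stand-in for a rounded sensor trace; replace with your own data.
trace = round.(sin.(0.1 .* (1:200)); digits = 1)

println("duplicates before: ", count_duplicates(trace))
jittered = add_jitter(trace)
println("duplicates after:  ", count_duplicates(jittered))
```

The jitter changes each value by roughly `scale`, so it should break exact ties without visibly altering the signal.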

For optimal_traditional_de, there is nothing more I can tell you: the method reports that it detected your signal to be similar to stochastic noise. You should analyze the output of delay_afnn directly.
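A hedged sketch of inspecting that output, assuming the positional call `delay_afnn(s, τ, ds)` from DelayEmbeddings.jl, which returns one value of Cao's E₁ statistic per embedding dimension in `ds`. The random signal is only a stand-in for the attached trace, and `τ = 1` and `ds = 2:7` are arbitrary illustrative choices.

```julia
using DelayEmbeddings

s = randn(2000)   # stand-in for a noisy trace; replace with your own data
τ = 1             # illustrative delay, e.g. taken from estimate_delay
ds = 2:7          # embedding dimensions to test

E1 = delay_afnn(s, τ, ds)

# For a deterministic signal, E₁ levels off near 1 once d is large enough.
for (d, e) in zip(ds, E1)
    println("d = ", d, "  E₁ = ", round(e; digits = 3))
end
```

Looking at how E₁ changes across dimensions shows what the convergence message is based on.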

Kai Hauke Krämer · Answer 2 · Tue Apr 13 2021 15:53:45 GMT+0800 (China Standard Time)

Yep, you are right. I initially "designed" this error message that way because I had these issues only with artificial data, like @xrisk. I'll make a PR.

@xrisk, I looked at the data and it does indeed look very noisy. The auto-mutual information indicates very low deterministic structure. This is also why optimal_traditional_de raises the "stochastic" alert. If I do

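using DelayEmbeddings, DelimitedFiles

data = vec(readdlm("points.txt"))  # load the trace as a plain vector

# tiny additive noise breaks exact ties between duplicate data points
data .+= 1e-11 .* randn(length(data))
theiler = estimate_delay(data, "mi_min")  # first minimum of the auto-mutual information

Y, τ_pec, ts_pec, L, _ = pecuzal_embedding(data; τs = 0:100, w = theiler)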

then PECUZAL executes but, as expected, returns the single vector without any embedding, because the signal seems to be too stochastic. May I ask what kind of model your data stems from? Maybe increasing the sampling time would help to "smooth" it?

Rishav Kundu · Answer 3 · Wed Apr 14 2021 21:57:38 GMT+0800 (China Standard Time)

@hkraemer thank you for your explanation. We are just doing some experimentation with implementations of some computer algorithms. We guessed that they may exhibit dynamical behavior; however, it seems that either this is not true or we are not capturing the correct metric from the system (or maybe there is just too much noise in our computer).

The auto-mutual information indicates very very low deterministic structure.

Can you please tell us how you measure this?

Anyway, thanks a lot for your help and for this great library 😄

Kai Hauke Krämer · Answer 4 · Wed Apr 14 2021 22:06:00 GMT+0800 (China Standard Time)

Hey @xrisk ,
you can measure the auto-mutual information with the function selfmutualinfo(). Simply type ? selfmutualinfo in your REPL for more information. The first local minimum of the auto-mutual information is a convenient estimate of the decorrelation time (there are other "methods", however, e.g. the first zero-crossing of the auto-correlation). DelayEmbeddings.jl therefore has the function estimate_delay(), which automatically gives you this estimate. You can then play around with different methods, e.g. the first zero-crossing of the AC (type ? estimate_delay in your REPL to see what is going on).
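To make the "first zero-crossing of the auto-correlation" idea concrete, here is a hand-rolled sketch in plain Julia. It is not the implementation used by DelayEmbeddings.jl; the function names `autocorr` and `first_zero_crossing` and the toy signals are assumptions for illustration. With the library, the corresponding route would be `estimate_delay` with the "ac_zero" method.

```julia
using Statistics

# Autocorrelation of `x` at a non-negative integer `lag` (hand-rolled sketch).
function autocorr(x::AbstractVector{<:Real}, lag::Integer)
    μ = mean(x)
    n = length(x)
    num = sum((x[i] - μ) * (x[i + lag] - μ) for i in 1:(n - lag))
    den = sum(abs2, x .- μ)
    return num / den
end

# First lag at which the autocorrelation is no longer positive,
# or `nothing` if that never happens within `maxlag`.
function first_zero_crossing(x; maxlag = length(x) ÷ 2)
    for lag in 1:maxlag
        autocorr(x, lag) <= 0 && return lag
    end
    return nothing
end

t = 0:0.1:100
periodic = sin.(t)          # slow oscillation: the estimate lands near a quarter period
noise = randn(length(t))    # white noise: correlation is lost within a few lags

println("periodic: ", first_zero_crossing(periodic))
println("noise:    ", first_zero_crossing(noise))
```

A very short decorrelation time, as for the white-noise signal, is one sign that a trace carries little deterministic structure at the chosen sampling rate.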