khundman / telemanom

A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.

Home Page: https://arxiv.org/abs/1802.04431


Does the data only have categorical inputs?

francisduan opened this issue

I have noticed that most of the input entries are zero, and I am wondering whether the attached dataset has any numerical inputs (sensor readings, etc.). If it does, are you using them directly alongside the one-hot encodings?

Yes, the telemetry channel's recent history is included. See the 'Data' section in the Readme for more details.

Hi, thanks for the reply. My question was whether you are using numerical values and one-hot encoded values together.

Does it make sense to separate the two? Numerical values might carry very different latent information from categorical values.

Yes, they are used together. This is shown here:

image

There's more discussion in my response to Q3 in #8 as well as in the paper.

It definitely could make sense to separate the two.
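As a rough illustration of what separating the two could look like, here is a minimal sketch. It assumes each input array has the telemetry value in column 0 and one-hot command indicators in the remaining columns; `split_inputs` is a hypothetical helper, not part of telemanom, and the toy values are made up.

```python
import numpy as np


def split_inputs(X):
    """Split inputs into a numerical part and a categorical part.

    Assumption: the last axis is the feature axis, column 0 holds the
    telemetry value, and columns 1+ are one-hot command indicators.
    """
    X = np.asarray(X, dtype=float)
    if X.shape[-1] < 2:
        raise ValueError("expected at least one telemetry and one command column")
    numeric = X[..., :1]      # keep the feature axis so shapes stay aligned
    categorical = X[..., 1:]
    return numeric, categorical


# Toy window of 3 timesteps: 1 telemetry column + 3 one-hot command columns.
window = np.array([
    [0.12, 0, 1, 0],
    [0.15, 0, 0, 1],
    [0.11, 1, 0, 0],
])
numeric, categorical = split_inputs(window)
print(numeric.shape, categorical.shape)
```

The two parts could then be fed to separate model branches, or concatenated again to recover the combined input used here.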

Thanks a lot for this. Sorry, I did not realize the original y is also an input.

So the algorithm transforms a multivariate time series (MTS) into a univariate one (the output, y hat), with the target y included in the MTS as well. Is this correct?

This is explained in the "Methods" section in the paper:

image

Prior values of y are inputs for the prediction of the current y value.
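To make that concrete, here is a minimal sketch of the windowing, not the repository's actual code. It assumes a `(timesteps, n_features)` array with the telemetry value in column 0 and one-hot command indicators in the other columns; `make_windows`, `l_s`, and `n_predictions` are illustrative names, and the toy data is synthetic.

```python
import numpy as np


def make_windows(series, l_s, n_predictions=1):
    """Slice a (timesteps, n_features) array into (X, y) pairs.

    X holds l_s consecutive timesteps of ALL features, including the
    telemetry value in column 0. y holds the next n_predictions
    telemetry values (column 0 only).
    """
    series = np.asarray(series, dtype=float)
    window = l_s + n_predictions
    n_windows = series.shape[0] - window + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than l_s + n_predictions")
    stacked = np.stack([series[i:i + window] for i in range(n_windows)])
    X = stacked[:, :l_s, :]   # history: telemetry + commands
    y = stacked[:, l_s:, 0]   # targets: telemetry only
    return X, y


# Synthetic example: 20 timesteps, 1 telemetry column + 3 command columns.
rng = np.random.default_rng(0)
timesteps, n_commands = 20, 3
telemetry = np.sin(np.linspace(0, 3, timesteps))
commands = np.eye(n_commands)[rng.integers(0, n_commands, timesteps)]
data = np.column_stack([telemetry, commands])

X, y = make_windows(data, l_s=5, n_predictions=2)
print(X.shape, y.shape)
```

Each target in `y` shows up again in column 0 of a later window in `X`, which is the sense in which prior values of y are inputs.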

OK, thank you.