dannyneil / public_plstm

Phased LSTM

Question: How to use Phased LSTM for regression data?

philipperemy opened this issue

Hey Danny,

I could send you a mail directly but I guess it's better to keep track here so that other people can have a look.

In the Phased LSTM paper, you mostly discussed classification problems.

Do you have any ideas how Phased LSTM could be used for a regression problem with asynchronous data?

Let's say, I have a sensor that sends data asynchronously. I would like to be able to predict the next data point. But its value depends on the time of arrival. It's not the same if it comes 1 second after or 10 minutes after.

With synchronous data it's quite easy, because you can assume the data points are evenly spaced, say one per minute. In that case it is just forecasting at t+1. With asynchronous data I guess it's trickier, as the next point could come 15 seconds or 25 seconds later. So if we only give the value of the next data point as the target, the model has no information about when that data point actually arrived.

A toy example:

data_point_1 {value = 0.02, timestamp = 0010}
data_point_2 {value = 0.04, timestamp = 0023}
data_point_3 {value = 0.01, timestamp = 0035}
data_point_4 {value = -0.02, timestamp = 0060}
data_point_5 {value = 0.04, timestamp = 0076}
data_point_6 {value = 0.09, timestamp = 0078}
data_point_7 {value = 0.03, timestamp = 0090}
data_point_8 {value = 0.01, timestamp = 00101}
data_point_9 {value = 0.02, timestamp = 00102}

Let's predict:
data_point_10 {value = 0.05, timestamp = 00106}

We can give data_point_1 up to data_point_9 to the network, along with their timestamps, as inputs. The network can figure out the frequencies and phases of the signals (strength of Phased LSTM!).

But how do we give the target? If we just give data_point_10.value, 0.05, it does not mean much since the timestamp is omitted. I guess we want to give the timestamp too.

For inference we would just query the model with {data_point_1 to data_point_9} and data_point_10.query_timestamp = 00106 (or possibly data_point_9.timestamp + forecasting_time_ahead in the general case), and hope to match data_point_10.value.
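A minimal sketch of that setup in NumPy, assuming the toy timestamps are plain integers (0010 read as 10, 00106 as 106). The helper name make_sample is hypothetical, and how the model actually consumes the query timestamp is left open; this only shows how the history, the query time and the target could be separated.

```python
import numpy as np

# Toy series from above, including data_point_10 as the point to predict.
values = np.array([0.02, 0.04, 0.01, -0.02, 0.04, 0.09, 0.03, 0.01, 0.02, 0.05])
timestamps = np.array([10, 23, 35, 60, 76, 78, 90, 101, 102, 106], dtype=float)


def make_sample(values, timestamps, n_history):
    """Split a series into (history, query_timestamp, target).

    Hypothetical helper: the history holds [value, timestamp] rows for the
    first n_history points; the next point supplies the query time and target.
    """
    history = np.stack([values[:n_history], timestamps[:n_history]], axis=1)
    query_timestamp = timestamps[n_history]
    target = values[n_history]
    return history, query_timestamp, target


history, query_timestamp, target = make_sample(values, timestamps, 9)
print(history.shape, query_timestamp, target)
```

At inference time the same query_timestamp slot could instead hold data_point_9.timestamp + forecasting_time_ahead.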

Am I correct? How could I improve my thinking?

Thanks!

I'm not well-versed in the particulars of the Phased LSTM, but for RNNs in general I think you have two options.

Since it sounds like you already know that what matters is the duration between events rather than absolute time, you could create a delta-time feature (e.g. data_point_2.timestamp - data_point_1.timestamp, basically np.diff(timestamps)). This has the benefit of better generalization across different sequences, since absolute time is discarded.
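A rough sketch of the delta-time feature on the toy data from the question, assuming the timestamps are plain integers (0010 read as 10, 00101 as 101). Setting the first delta to 0 is an arbitrary choice made here, since the first event has no predecessor.

```python
import numpy as np

# Values and timestamps of data_point_1 .. data_point_9 from the question.
values = np.array([0.02, 0.04, 0.01, -0.02, 0.04, 0.09, 0.03, 0.01, 0.02])
timestamps = np.array([10, 23, 35, 60, 76, 78, 90, 101, 102], dtype=float)

# Time elapsed since the previous event. Prepending the first timestamp
# makes the first delta 0 and keeps the array the same length as `values`.
deltas = np.diff(timestamps, prepend=timestamps[0])

# One row per event: [value, delta_time]. Absolute time is no longer present.
features = np.stack([values, deltas], axis=1)
print(features)
```

The resulting (9, 2) array can be fed to an RNN one row per time step.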

Alternatively, if there are non-stationary effects in the process you're modelling (e.g. values are noticeably different at early timestamps compared to later ones), you could try choosing a fixed sample rate and inserting zero values to match, like:

{value = 0.02, timestamp = 0010}
{value = 0.00, timestamp = 0011}
{value = 0.00, timestamp = 0012}
...
{value = 0.00, timestamp = 0021}
{value = 0.00, timestamp = 0022}
{value = 0.04, timestamp = 0023}

Then it's up to the RNN to learn the importance of durations between consecutive events.
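The zero-filling above could be sketched like this, assuming integer timestamps and a sample period of one time unit (the variable step is my assumption; pick whatever rate suits the sensor).

```python
import numpy as np

# First three events of the toy example.
values = np.array([0.02, 0.04, 0.01])
timestamps = np.array([10, 23, 35])

step = 1  # assumed sample period, in the same units as the timestamps

# Regular grid covering the first to the last timestamp, inclusive.
grid = np.arange(timestamps[0], timestamps[-1] + step, step)

# Start from all zeros, then write each observed value into its grid slot.
dense = np.zeros(len(grid))
idx = (timestamps - timestamps[0]) // step
dense[idx] = values

for t, v in zip(grid[:3], dense[:3]):
    print(f"{{value = {v:.2f}, timestamp = {t:04d}}}")
```

One caveat: a real reading of 0.00 is indistinguishable from padding here. An extra 0/1 indicator column is one possible way around that, though it goes beyond the suggestion above.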

I'm sure there's smarter stuff out there, but maybe this could be a start.

@carlthome thanks for this clarification :)