titu1994 / neural-architecture-search

Basic implementation of [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578).

about RNN prediction

Guocode opened this issue · comments

Why does the policy network use state[0] as input rather than the whole state? It is difficult to understand how the policy network can predict the whole network architecture from only the first layer. I think it should at least use state[-1] (the last layer of the previous state) to predict the first layer of the next state.

Hmm. When I was implementing this, many of the details were not available in the paper, so I had to come up with reasonable defaults. Of course, those may have been wrong.

You may try using either the first or the last state. I chose the first state because the output of the first RNN step is chained as the next input, so it made logical sense to have state[0] as the initial input.

If you try state[-1] and it works, I would be glad to change it.
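
For reference, here is a rough sketch of what I mean by chaining. This is only a hedged illustration, not the actual code in this repo; the function name `predict_architecture`, the hyperparameters, and the 4-way action space are all made up for the example.

```python
import tensorflow as tf
from tensorflow import keras

num_choices = 4      # e.g. 2 filter options x 2 kernel options, flattened
num_layers = 2       # layers to predict
embed_dim = 20
lstm_units = 32

# One shared cell so the hidden state carries over between prediction steps.
embedding = keras.layers.Embedding(num_choices, embed_dim)
cell = keras.layers.LSTMCell(lstm_units)
classifier = keras.layers.Dense(num_choices, activation="softmax")

def predict_architecture(seed_token):
    """Chain each step's sampled output into the next step's input."""
    token = tf.constant([[seed_token]])                        # shape (1, 1)
    states = [tf.zeros((1, lstm_units)), tf.zeros((1, lstm_units))]
    actions = []
    for _ in range(2 * num_layers):                            # filter + kernel per layer
        x = embedding(token)[:, 0, :]                          # (1, embed_dim)
        out, states = cell(x, states)
        probs = classifier(out)
        token = tf.random.categorical(tf.math.log(probs), 1)   # sample next action
        actions.append(int(token[0, 0]))
    return actions

# Seeding with the first element of the previous state, i.e. state[0]:
previous_state = [0, 0, 1, 1]
print(predict_architecture(previous_state[0]))
```

Seeding with `previous_state[-1]` instead would only change the first input; the rest of the chain stays the same.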

I must say, the progressive NAS version is much more stable than this RL codebase.

Thanks very much for your reply and the implementation. In fact, I'm not so familiar with RNNs, but I think it's more reasonable to predict the next state from the whole previous state: if you want to predict the next sentence someone will say, the whole previous sentence should be considered, not just its last word.

I don't really understand what you mean by the "whole" state. Keras, and an RNN in general, will only accept states of shape (batch size, state size) * number of hidden states as input.

If by "whole state" your mean all timesteps of the state (batch size, timesteps, state size) * number of hidden states, it is not possible to do so.

That is why I chain the last timestep of this state vector to the input of the next RNN call. This is standard practice in Machine Translation and stateful RNN prediction.
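
As a concrete illustration of the shapes involved (a hedged sketch, not code from this repo; the sizes are arbitrary):

```python
import tensorflow as tf
from tensorflow import keras

# A Keras LSTM's hidden states each have shape (batch, units);
# only the sequence *input* carries a timestep axis.
lstm = keras.layers.LSTM(32, return_state=True, return_sequences=False)
x = tf.random.normal((1, 4, 20))       # (batch, timesteps, features)
output, h, c = lstm(x)                 # h, c: (1, 32) each -- no timestep axis
print(output.shape, h.shape, c.shape)  # (1, 32) (1, 32) (1, 32)
```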

Let's take an easy example. Suppose the current_state is [(filter) 16, (kernel) 3, (filter) 32, (kernel) 5], encoded as [0, 0, 1, 1], which might represent a two-layer network. To predict the next state, in your code the RNN takes only the (kernel) 5 as input and then predicts the encoded next_state, say [1, 1, 0, 1]. I understand that the RNN successively outputs next_state[0] from (kernel) 5, then next_state[1] from next_state[0], and so on.

What I mean is: could it work like this instead? The RNN takes current_state[0] and drops the output, then current_state[1] and drops it again, then current_state[2] and drops it again; when it reaches current_state[3], the subsequent outputs are taken as next_state. Under this scheme the RNN has gone through the "whole" current_state and "remembered" all of it, so its prediction is made with complete knowledge of the current state.

Yes, that's a way of priming the RNN state for prediction, and it can be done. I haven't implemented it, but if you wish, you can send a PR and I'll review it.
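
A rough sketch of what such priming could look like (again a hedged illustration, not code from this repo; `predict_next_state` and the hyperparameters are invented for the example):

```python
import tensorflow as tf
from tensorflow import keras

num_choices = 4
embed_dim = 20
lstm_units = 32

embedding = keras.layers.Embedding(num_choices, embed_dim)
cell = keras.layers.LSTMCell(lstm_units)
classifier = keras.layers.Dense(num_choices, activation="softmax")

def predict_next_state(current_state):
    states = [tf.zeros((1, lstm_units)), tf.zeros((1, lstm_units))]

    # Priming pass: consume every token of current_state, keep only the hidden state.
    for token in current_state:
        x = embedding(tf.constant([[token]]))[:, 0, :]
        _, states = cell(x, states)

    # Decoding pass: generate next_state autoregressively from the primed state.
    token = tf.constant([[current_state[-1]]])
    next_state = []
    for _ in range(len(current_state)):
        x = embedding(token)[:, 0, :]
        out, states = cell(x, states)
        probs = classifier(out)
        token = tf.random.categorical(tf.math.log(probs), 1)
        next_state.append(int(token[0, 0]))
    return next_state

print(predict_next_state([0, 0, 1, 1]))   # e.g. -> [1, 1, 0, 1]
```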