garretthoffman / lstm-oreilly

How to build a Multilayered LSTM Network to infer Stock Market sentiment from social conversation using TensorFlow.

Initial state in model

Nicolabo opened this issue · comments

Hi Garrett,
Great article and GitHub repo. I have a question regarding the initial state in your LSTM model. During training you initialize a state for each epoch, but within an epoch you implement a stateful LSTM, carrying initial_state over from batch to batch.

state = sess.run(initial_state)

train_acc = []
for ii, (x, y) in enumerate(utl.get_batches(train_x, train_y, batch_size), 1):
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: keep_prob,
            initial_state: state}  # feed the previous batch's final state
    loss_, state, _, batch_acc = sess.run([loss, final_state, optimizer, accuracy], feed_dict=feed)

But in your graph, initial_state is not a placeholder.

initial_state = cell.zero_state(batch_size, tf.float32)

My understanding is that it's not going to work as you expect: initial_state will always be the zero state. Please correct me if I am wrong.

Hi @Nicolabo!

Any tf.placeholder must be fed into your network, but you can actually feed any tensor in your DAG through feed_dict if you choose to. Check out the example in the Google Colab notebook linked below and let me know if you have any other questions!

https://colab.research.google.com/drive/1JSpCLxmYuAPslH4Ixt12RvLzLYh8sPKR
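
The gist is something like this minimal sketch (TensorFlow 1.x; the tensor names here are just illustrative, not from the repo):

import tensorflow as tf

a = tf.constant(3.0)            # an ordinary tensor, not a tf.placeholder
b = tf.placeholder(tf.float32)
c = a + b

with tf.Session() as sess:
    # a keeps its graph-defined value by default...
    print(sess.run(c, feed_dict={b: 1.0}))           # 4.0
    # ...but feed_dict can override any feedable tensor, not just placeholders,
    # which is why feeding initial_state works even though it isn't a placeholder.
    print(sess.run(c, feed_dict={a: 10.0, b: 1.0}))  # 11.0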

Thanks, I didn't know that!
One more thing when it comes to initial_state: during prediction, you also divide the data into batches and update the state from batch to batch. I am not quite sure why that is. It seems some predicted values have an impact on other predicted values (see the sketch below for what I would have expected instead). Am I wrong?
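
To illustrate, here is a rough sketch of my own, reusing your variable names (test_x and test_y are assumed to exist), where the state is reset for every prediction batch:

test_acc = []
for x, y in utl.get_batches(test_x, test_y, batch_size):
    state = sess.run(initial_state)   # fresh zero state, so no batch influences another
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: 1.0,          # no dropout at prediction time
            initial_state: state}
    batch_acc = sess.run(accuracy, feed_dict=feed)
    test_acc.append(batch_acc)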

Hmm, that's a really good point and something I may have overlooked when putting this together. Let me look into this and get back to you.