garretthoffman / lstm-oreilly

How to build a Multilayered LSTM Network to infer Stock Market sentiment from social conversation using TensorFlow.

Initial state in model

Nicolabo opened this issue · comments

Hi Garrett,
Great article and GitHub repo. I have a question regarding the initial state in your LSTM model. During training you initialize a state for each epoch, but within an epoch you implement a stateful LSTM, carrying initial_state over from batch to batch.

state = sess.run(initial_state)

train_acc = []
for ii, (x, y) in enumerate(utl.get_batches(train_x, train_y, batch_size), 1):
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: keep_prob,
            initial_state: state}  # feed the previous batch's final state
    loss_, state, _, batch_acc = sess.run([loss, final_state, optimizer, accuracy], feed_dict=feed)

But in your graph, initial_state is not a placeholder.

initial_state = cell.zero_state(batch_size, tf.float32)

My understanding is that it's not going to work as you expect: initial_state will always be the zero state. Please correct me if I am wrong.

Hi @Nicolabo!

Any tf.placeholder must be fed into your network, but you can actually feed any tensor in your DAG through feed_dict if you choose to. Check out the example in the Google Colab notebook linked below and let me know if you have any other questions!

https://colab.research.google.com/drive/1JSpCLxmYuAPslH4Ixt12RvLzLYh8sPKR
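
The gist is something like this minimal sketch (TensorFlow 1.x; the tensor names here are just illustrative, not from the repo):

import tensorflow as tf

a = tf.constant(3.0)            # an ordinary tensor, not a tf.placeholder
b = tf.placeholder(tf.float32)
c = a + b

with tf.Session() as sess:
    # a keeps its graph-defined value by default...
    print(sess.run(c, feed_dict={b: 1.0}))           # 4.0
    # ...but feed_dict can override any feedable tensor, not just placeholders,
    # which is why feeding initial_state works even though it isn't a placeholder.
    print(sess.run(c, feed_dict={a: 10.0, b: 1.0}))  # 11.0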

Thanks, I didn't know that!
One more thing when it comes to initial_state: during prediction, you also divide the data into batches and update the state from batch to batch. I am not quite sure why that is. It seems some predicted values have an impact on other predicted values (see the sketch below for what I would have expected instead). Am I wrong?
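
To illustrate, here is a rough sketch of my own, reusing your variable names (test_x and test_y are assumed to exist), where the state is reset for every prediction batch:

test_acc = []
for x, y in utl.get_batches(test_x, test_y, batch_size):
    state = sess.run(initial_state)   # fresh zero state, so no batch influences another
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: 1.0,          # no dropout at prediction time
            initial_state: state}
    batch_acc = sess.run(accuracy, feed_dict=feed)
    test_acc.append(batch_acc)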

Hmm, that's a really good point and something I may have overlooked when putting this together. Let me look into this and get back to you.