Multi-layer LSTMs broken
kylebgorman opened this issue
Enabling either `--encoder_layers 2` or `--decoder_layers 2` causes runtime crashes during training. All of the following seem to be affected: LSTM, attentive LSTM, pointer-generator, transducer.
```
Expected hidden[0] size (2, 64, 100), got [1, 64, 100]
RuntimeError: Expected hidden[0] size (1, 40, 100), got [2, 40, 100]
```
etc. I have tagged this a release blocker.
Ok, weird. I thought this worked in this codebase. It should be a fairly easy fix; I just have to pass the last layer's hidden state to the decoder at each step. Should be able to do this ~Wednesday.
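For context, the error comes from PyTorch's `nn.LSTM` hidden-state shape check: a 2-layer encoder emits hidden states of shape `(2, batch, hidden)`, while a 1-layer decoder only accepts `(1, batch, hidden)`. A minimal sketch of the mismatch and the last-layer slicing fix (the sizes here are illustrative, not taken from the codebase):

```python
import torch
import torch.nn as nn

batch_size, hidden_size, input_size, seq_len = 64, 100, 10, 5
seq = torch.randn(seq_len, batch_size, input_size)

# A 2-layer encoder returns hidden states stacked over layers:
# h_n and c_n both have shape (num_layers, batch, hidden).
encoder = nn.LSTM(input_size, hidden_size, num_layers=2)
_, (h_n, c_n) = encoder(seq)
assert h_n.shape == (2, batch_size, hidden_size)

# A 1-layer decoder expects (1, batch, hidden); passing the full stack
# raises "RuntimeError: Expected hidden[0] size (1, 64, 100), got [2, 64, 100]".
decoder = nn.LSTM(input_size, hidden_size, num_layers=1)

# The fix sketched above: keep only the last layer's state when seeding
# the decoder. Slicing with [-1:] preserves the leading layer dimension.
out, _ = decoder(seq, (h_n[-1:], c_n[-1:]))
assert out.shape == (seq_len, batch_size, hidden_size)
```

When the decoder itself has multiple layers, the analogous fix is to broadcast or repeat the encoder's final-layer state across the decoder's layers rather than slicing.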
This was closed by #57, as far as I know.