Multi-layer LSTMs broken
kylebgorman opened this issue
Enabling either `--encoder_layers 2` or `--decoder_layers 2` causes runtime crashes during training. All of the following seem to be affected: LSTM, attentive LSTM, pointer-generator, transducer.
```
Expected hidden[0] size (2, 64, 100), got [1, 64, 100]
RuntimeError: Expected hidden[0] size (1, 40, 100), got [2, 40, 100]
```
etc. I have tagged this a release blocker.
Ok, weird. I thought this worked in this codebase. It should be a fairly easy fix; I just have to pass the last layer's hidden state to the decoder at each step. Should be able to do this ~Wednesday.
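For context, the error comes from PyTorch's `nn.LSTM` hidden-state shape check: a 2-layer encoder emits hidden states of shape `(2, batch, hidden)`, while a 1-layer decoder only accepts `(1, batch, hidden)`. A minimal sketch of the mismatch and the last-layer slicing fix (the sizes here are illustrative, not taken from the codebase):

```python
import torch
import torch.nn as nn

batch_size, hidden_size, input_size, seq_len = 64, 100, 10, 5
seq = torch.randn(seq_len, batch_size, input_size)

# A 2-layer encoder returns hidden states stacked over layers:
# h_n and c_n both have shape (num_layers, batch, hidden).
encoder = nn.LSTM(input_size, hidden_size, num_layers=2)
_, (h_n, c_n) = encoder(seq)
assert h_n.shape == (2, batch_size, hidden_size)

# A 1-layer decoder expects (1, batch, hidden); passing the full stack
# raises "RuntimeError: Expected hidden[0] size (1, 64, 100), got [2, 64, 100]".
decoder = nn.LSTM(input_size, hidden_size, num_layers=1)

# The fix sketched above: keep only the last layer's state when seeding
# the decoder. Slicing with [-1:] preserves the leading layer dimension.
out, _ = decoder(seq, (h_n[-1:], c_n[-1:]))
assert out.shape == (seq_len, batch_size, hidden_size)
```

When the decoder itself has multiple layers, the analogous fix is to broadcast or repeat the encoder's final-layer state across the decoder's layers rather than slicing.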
This was closed by #57, as far as I know.