CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning


Multi-layer LSTMs broken

kylebgorman opened this issue

Enabling either --encoder_layers 2 or --decoder_layers 2 will cause runtime crashes during training. All of the following seem to have issues: LSTM, attentive LSTM, pointer-generator, transducer.

Expected hidden[0] size (2, 64, 100), got [1, 64, 100]
RuntimeError: Expected hidden[0] size (1, 40, 100), got [2, 40, 100]

etc. I have tagged this as a release blocker.
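
For context, this is the standard PyTorch multi-layer LSTM shape mismatch: the hidden and cell states have shape (num_layers, batch, hidden), so handing a two-layer encoder's state to a one-layer decoder (or vice versa) fails the size check. A minimal sketch, outside of yoyodyne and with hypothetical sizes, that reproduces the same kind of RuntimeError:

```python
import torch
import torch.nn as nn

# Hypothetical minimal reproduction (not yoyodyne code): the encoder has two
# layers, so its hidden/cell states are (2, batch, hidden), but a one-layer
# decoder expects (1, batch, hidden) and raises the size-mismatch error.
encoder = nn.LSTM(input_size=10, hidden_size=100, num_layers=2, batch_first=True)
decoder = nn.LSTM(input_size=10, hidden_size=100, num_layers=1, batch_first=True)

source = torch.randn(64, 5, 10)   # (batch, source length, input size)
target = torch.randn(64, 1, 10)   # a single decoder step

_, (h, c) = encoder(source)       # h, c: (2, 64, 100)
decoder(target, (h, c))           # RuntimeError: Expected hidden[0] size (1, 64, 100), got [2, 64, 100]
```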

commented

Ok, weird. I thought this worked in this codebase. It should be a fairly easy fix; I just have to pass the last layer to the decoder at each step. Should be able to do this ~Wednesday.
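
If the fix is along those lines, a minimal sketch of the idea (hypothetical helper name, not the actual patch) is to slice the encoder state down to the number of layers the decoder expects before stepping it:

```python
import torch
import torch.nn as nn

# Sketch of the fix described above (hypothetical helper, not the actual patch):
# keep only the last encoder layer(s) of the hidden and cell state when the
# decoder has fewer layers than the encoder.
def init_decoder_state(h, c, decoder_layers):
    """Slices (encoder_layers, batch, hidden) states to (decoder_layers, batch, hidden)."""
    return h[-decoder_layers:].contiguous(), c[-decoder_layers:].contiguous()

encoder = nn.LSTM(input_size=10, hidden_size=100, num_layers=2, batch_first=True)
decoder = nn.LSTM(input_size=10, hidden_size=100, num_layers=1, batch_first=True)

_, (h, c) = encoder(torch.randn(64, 5, 10))          # h, c: (2, 64, 100)
h0, c0 = init_decoder_state(h, c, decoder_layers=1)  # (1, 64, 100)
decoder(torch.randn(64, 1, 10), (h0, c0))            # shapes now agree
```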

This is closed in #57, as far as I know.