lmnt-com / haste

Haste: a fast, simple, and open RNN library


Layer normalization

usamec opened this issue

It would be nice to support some form of layer normalization in the LSTM and GRU layers (example: https://github.com/pytorch/pytorch/blob/master/benchmarks/fastrnns/custom_lstms.py#L171).

Hmm that's an interesting implementation. They're applying layer norm to c_t in addition to h_t. The supplementary material in Ba et al. (pp. 13–14) only applies layer norm to h_t in both of their LSTM variants.
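For reference, a condensed sketch of what that linked cell does (parameter names and initialization are simplified here; the point is the extra LayerNorm applied to c_t itself, so the normalized cell state is what gets carried to the next step):

```python
import torch
import torch.nn as nn

class LayerNormLSTMCellSketch(nn.Module):
    """Condensed sketch of the cell in the linked custom_lstms.py: layer norm is
    applied to the input and recurrent projections AND to the new cell state c_t."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.weight_ih = nn.Parameter(0.1 * torch.randn(4 * hidden_size, input_size))
        self.weight_hh = nn.Parameter(0.1 * torch.randn(4 * hidden_size, hidden_size))
        self.ln_i = nn.LayerNorm(4 * hidden_size)  # LN over the input projection
        self.ln_h = nn.LayerNorm(4 * hidden_size)  # LN over the recurrent projection
        self.ln_c = nn.LayerNorm(hidden_size)      # extra LN on c_t (not in Ba et al.)

    def forward(self, x, state):
        h, c = state
        gates = self.ln_i(x @ self.weight_ih.t()) + self.ln_h(h @ self.weight_hh.t())
        i, f, g, o = gates.chunk(4, dim=1)
        # The *normalized* cell state is carried forward to the next time step.
        c_new = self.ln_c(torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g))
        h_new = torch.sigmoid(o) * torch.tanh(c_new)
        return h_new, (h_new, c_new)
```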

Do you know if there's any follow-up literature that explains the PyTorch variant?

@sharvil I don't know of any. I personally think that any variant of GRU/LSTM with LayerNorm would be a great addition.

Here's what the haste.LayerNormLSTM implementation looks like:
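
For concreteness, here's a sketch of the recurrence, following eqs. 20–22 of Ba et al. with the adjustments listed below; LN(z; γ) denotes layer normalization with gain γ and no bias term, LN(z; γ, β) the version with a bias, and the subscripts and weight names are just labels for the independent parameter sets:

$$
\begin{aligned}
\begin{pmatrix} f_t \\ i_t \\ o_t \\ g_t \end{pmatrix}
  &= \mathrm{LN}\!\left(W_h h_{t-1};\, \gamma_h\right)
   + \mathrm{LN}\!\left(W_x x_t;\, \gamma_x\right) + b \\
c_t &= \sigma(f_t) \odot c_{t-1} + \sigma(i_t) \odot \tanh(g_t) \\
h_t &= \sigma(o_t) \odot \tanh\!\left(\mathrm{LN}\!\left(c_t;\, \gamma_c, \beta_c\right)\right)
\end{aligned}
$$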



This implementation is nearly identical to eqs. 20–22 of the layer norm paper. The differences are:

  1. we don't apply a bias term to layer norms on the input or recurrent connection; these parameters are unnecessary since there's already a bias term (... + b) applied by the LSTM
  2. we use γ instead of α to denote the gain parameter (notation change)
  3. we initialize the gain to 1 and the bias to 0 instead of the other way around (seems like a typo in the paper)

I haven't gotten around to updating the docs yet, but haste.LSTM can just be replaced with haste.LayerNormLSTM. Zoneout, DropConnect, etc. are all supported in LayerNormLSTM as well.
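A minimal usage sketch, assuming the haste_pytorch bindings and that LayerNormLSTM accepts the same input_size / hidden_size / zoneout / dropout constructor arguments as haste.LSTM:

```python
import torch
import haste_pytorch as haste

# Time-major input: (seq_len, batch, input_size). Haste kernels run on CUDA.
x = torch.rand(250, 32, 128).cuda()

# Drop-in swap: haste.LSTM(...) -> haste.LayerNormLSTM(...).
# The zoneout/dropout arguments are assumed to carry over unchanged.
rnn = haste.LayerNormLSTM(
    input_size=128,
    hidden_size=256,
    zoneout=0.1,   # Zoneout regularization on the recurrent state
    dropout=0.05,  # DropConnect on the recurrent weights
).cuda()

y, state = rnn(x)  # y: output sequence, state: final recurrent state
```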

Nice! Having a GRU version would also be great, but we can probably manage with LSTMs :)

Our LSTM implementation is much further along than the GRU one, so we started with LSTMs first. When we do the GRU updates, we'll keep LayerNorm in mind. Thanks for the feature request!