Batch Normalized Recurrent Neural Networks
Theano code for the Penn Treebank language model experiments in the paper Batch Normalized Recurrent Neural Networks. The baseline LSTMs reproduce the results from the paper Recurrent Neural Network Regularization.
The name of the repo is, of course, based on Karpathy's char-rnn.
Requirements
Theano is required for running the experiments:
pip install Theano
Plotly is optional and is used to generate plots after training:
pip install plotly
Experiments
To run the small reference model on CPU:
python experiments/ptb_small_ref.py
To run the small normalized model on GPU:
THEANO_FLAGS=device=gpu python experiments/ptb_small_norm.py
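For readers unfamiliar with the core operation, here is a minimal NumPy sketch of batch normalization (normalizing each feature over the batch axis, then scaling and shifting), as described by Ioffe & Szegedy. The function name and toy data are illustrative only and do not come from this repo's code.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch dimension (axis 0),
    # then apply a learnable scale (gamma) and shift (beta).
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Toy example: a batch of 4 samples with 3 features.
x = np.random.randn(4, 3) * 5.0 + 2.0
y = batch_norm(x)
print(y.mean(axis=0))  # each feature mean is close to 0
print(y.std(axis=0))   # each feature std is close to 1
```

In the normalized recurrent models, this operation is applied per timestep to the hidden-state projections inside the LSTM rather than to a plain feed-forward activation, but the normalization itself is the same.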
References
- Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.
- Zaremba, W., Sutskever, I., & Vinyals, O. (2014). Recurrent neural network regularization.