This list remains to be updated. Not for wide circulation.
Self-Normalizing Neural Networks
Improving neural networks by preventing co-adaptation of feature detectors
Efficient Object Localization Using Convolutional Networks
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Language Modeling with Gated Convolutional Networks
ADADELTA: An Adaptive Learning Rate Method
Adam: A Method for Stochastic Optimization
Generating Sequences With Recurrent Neural Networks
SGDR: Stochastic Gradient Descent with Warm Restarts
Reference: http://pytorch.org/docs
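Most of the techniques listed above have direct counterparts in PyTorch (the docs referenced above). Below is a minimal, illustrative sketch wiring several of them together: `weight_norm` (Weight Normalization), `nn.SELU` with `nn.AlphaDropout` (Self-Normalizing Neural Networks), `nn.Dropout` (preventing co-adaptation of feature detectors), `nn.GLU` (Gated Convolutional Networks), `torch.optim.Adam`, and `CosineAnnealingWarmRestarts` (SGDR). The layer sizes and hyperparameters are arbitrary choices for the example, not values from any of the papers.

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Arbitrary toy architecture combining several of the listed techniques.
model = nn.Sequential(
    weight_norm(nn.Linear(16, 64)),  # weight-normalized reparameterization
    nn.SELU(),                       # self-normalizing activation
    nn.AlphaDropout(p=0.1),          # dropout variant that preserves SELU statistics
    nn.Linear(64, 20),
    nn.GLU(dim=-1),                  # gated linear unit; halves the last dim to 10
    nn.Dropout(p=0.5),               # standard dropout against co-adaptation
    nn.Linear(10, 1),
)

opt = Adam(model.parameters(), lr=1e-3)              # Adam optimizer
sched = CosineAnnealingWarmRestarts(opt, T_0=10)     # SGDR-style warm restarts

x = torch.randn(8, 16)
y = torch.randn(8, 1)
for step in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
    sched.step()
```

ADADELTA is available as `torch.optim.Adadelta` with the same interface as `Adam` above; spatial dropout from the object-localization paper corresponds to `nn.Dropout2d`.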