yobibyte / torch-lsuv

Torch implementation of LSUV weight init from http://arxiv.org/abs/1511.06422

torch-lsuv

This is an attempt to reproduce the experiments from arXiv:1511.06422. Blog post with some details here.
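
For orientation, the core step of LSUV as described in the paper (start from orthonormal weights, then rescale each layer until its output variance on a data batch is close to 1) can be sketched for a single linear layer. This is only an illustrative sketch in plain Lua with no Torch dependency, not the code in this repository: every function name below (gaussian_matrix, orthonormalize_rows, lsuv_layer) is hypothetical, the batch is random data, and Gram-Schmidt is used here as a simple stand-in for the orthonormal initialisation.

```lua
-- Minimal LSUV sketch for one linear layer in plain Lua (no Torch).
-- All names are hypothetical and chosen for illustration only.
math.randomseed(42)

-- Standard normal sample via the Box-Muller transform.
local function randn()
  local u1 = 1 - math.random()  -- in (0, 1], avoids log(0)
  local u2 = math.random()
  return math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)
end

-- rows x cols matrix of N(0, 1) samples.
local function gaussian_matrix(rows, cols)
  local m = {}
  for i = 1, rows do
    m[i] = {}
    for j = 1, cols do m[i][j] = randn() end
  end
  return m
end

local function dot(a, b)
  local s = 0
  for k = 1, #a do s = s + a[k] * b[k] end
  return s
end

-- Orthonormalise the rows in place with modified Gram-Schmidt.
-- Assumes rows <= cols; a stand-in for the orthonormal init step.
local function orthonormalize_rows(w)
  for i = 1, #w do
    for p = 1, i - 1 do
      local proj = dot(w[i], w[p])
      for k = 1, #w[i] do w[i][k] = w[i][k] - proj * w[p][k] end
    end
    local norm = math.sqrt(dot(w[i], w[i]))
    for k = 1, #w[i] do w[i][k] = w[i][k] / norm end
  end
  return w
end

-- Forward pass: y = W x for every sample x in the batch.
local function forward(w, batch)
  local out = {}
  for n = 1, #batch do
    out[n] = {}
    for i = 1, #w do out[n][i] = dot(w[i], batch[n]) end
  end
  return out
end

-- Variance over all entries of a matrix.
local function variance(m)
  local sum, sumsq, count = 0, 0, 0
  for i = 1, #m do
    for j = 1, #m[i] do
      sum = sum + m[i][j]
      sumsq = sumsq + m[i][j] * m[i][j]
      count = count + 1
    end
  end
  local mean = sum / count
  return sumsq / count - mean * mean
end

-- Rescale w until the output variance on `batch` is within `tol` of 1,
-- or until max_iter rescaling steps have been made.
local function lsuv_layer(w, batch, tol, max_iter)
  local var = variance(forward(w, batch))
  local iter = 0
  while math.abs(var - 1) > tol and iter < max_iter do
    local scale = 1 / math.sqrt(var)
    for i = 1, #w do
      for j = 1, #w[i] do w[i][j] = w[i][j] * scale end
    end
    var = variance(forward(w, batch))
    iter = iter + 1
  end
  return var, iter
end

local w = orthonormalize_rows(gaussian_matrix(8, 16))
local batch = gaussian_matrix(64, 16)  -- 64 random samples of dimension 16
local final_var, iters = lsuv_layer(w, batch, 0.01, 10)
print(string.format("output variance %.4f after %d iteration(s)", final_var, iters))
```

In a full network the same loop would be run layer by layer, feeding each layer the activations produced by the already-initialised layers before it. With non-linearities in between, more than one rescaling step may be needed, which is why the loop has an iteration cap.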

How to use

MNIST

th mnist-example.lua --lsuv

CIFAR-10

(code taken from here, thanks to @szagoruyko)

Download and preprocess data first:

cd cifar.torch
OMP_NUM_THREADS=2 th -i provider.lua
provider = Provider()
provider:normalize()
torch.save('provider.t7',provider)

Train with LSUV:

CUDA_VISIBLE_DEVICES=0 th train.lua --model vgg_bn_drop -s logs/vgg --lsuv

Results

Test accuracy for MNIST (WARNING! The no-LSUV case is without BN)

I used this only to check that training works (run with the -f key to train on the full dataset).

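| epoch | with lsuv (lr=0.1) | with lsuv (lr=0.05) | without lsuv (lr=0.001) | with lsuv (lr=0.001) |
|-------|--------------------|---------------------|-------------------------|----------------------|
| 1     | 97.77%             | 96.69%              | 83.39%                  | 78.28%               |
| 2     | 98.45%             | 97.94%              | 89.25%                  | 87.75%               |
| 3     | 98.63%             | 98.37%              | 91.23%                  | 91.19%               |
| 4     | 98.74%             | 98.57%              | 92.46%                  | 92.82%               |
| 5     | 98.88%             | 98.72%              | 93.23%                  | 93.81%               |
| 6     | 98.97%             | 98.75%              | 93.88%                  | 94.53%               |
| 7     | 99.03%             | 98.86%              | 94.44%                  | 95.06%               |
| 8     | 99.01%             | 98.86%              | 94.81%                  | 95.4%                |
| 9     | 99.01%             | 98.9%               | 95.03%                  | 95.87%               |
| 10    | 98.96%             | 98.91%              | 95.29%                  | 96.15%               |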

Training without LSUV at learning rates 0.05 and 0.01 did not converge within 10 epochs (test accuracy stayed at 11.35% for all 10 epochs). This is because the no-LSUV runs did not use BN.

Test accuracy for CIFAR-10

References

Thanks to @ikostrikov for debugging and help.
