jdeng / rbm-mnist

C++11 implementation of Geoff Hinton's Deep Learning MATLAB code

Conjugate Gradient with Rectified Linear Unit

wichtounet opened this issue

Hi,

First, thanks a lot for the code of this project; it helped me a lot in implementing a working RBM and DBN.

The CG implementation works quite well with sigmoid hidden units, but it does not seem to work with NReLU hidden units (max(0, x) as the activation function and max(0, x + N(0, sigmoid(x))) for sampling). Do you have an idea of what needs to change to make it work with these units?
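
For reference, a minimal C++11 sketch of the NReLU unit as described above, following Nair and Hinton's formulation and treating sigmoid(x) as the variance of the Gaussian noise; the function names are illustrative and not taken from rbm.h:

```cpp
#include <algorithm>
#include <cmath>
#include <random>

inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Mean activation of an NReLU unit: max(0, x).
inline double nrelu_mean(double x) { return std::max(0.0, x); }

// Sample from an NReLU unit: max(0, x + N(0, sigmoid(x))), where sigmoid(x)
// is taken as the noise variance, so the standard deviation is its square root.
inline double nrelu_sample(double x, std::mt19937& gen) {
    std::normal_distribution<double> noise(0.0, std::sqrt(sigmoid(x)));
    return std::max(0.0, x + noise(gen));
}
```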

Thanks

commented

I think you'll need to modify the gradient() function.
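
Presumably the relevant change is the derivative that gradient() backpropagates through the hidden activations: for sigmoid units it is y(1 - y) in terms of the output y, while for rectified linear units it is an indicator on the pre-activation. A hedged sketch, with illustrative names not taken from rbm.h:

```cpp
// Derivative of a sigmoid unit, expressed in terms of its output y = sigmoid(x).
inline double sigmoid_derivative(double y) { return y * (1.0 - y); }

// Derivative of a rectified linear unit, expressed in terms of its
// pre-activation x: 1 where the unit is active, 0 where it is clamped at zero.
inline double relu_derivative(double x) { return x > 0.0 ? 1.0 : 0.0; }
```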

Never mind, I made several mistakes. The condition at https://github.com/jdeng/rbm-mnist/blob/master/src/rbm.h#L581-582 is indeed very important. The greedy layer-wise pre-training needs to be configured differently, and the learning rate and weight decay need to be tuned specifically for NReLU units. Although it converges more slowly than with stochastic binary hidden units, it now seems to work.
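
As a purely illustrative sketch of configuring training differently per unit type: the struct, field names, and values below are hypothetical and not taken from rbm.h. Since rectified linear units are unbounded, a smaller learning rate than for binary units is a common starting point, but both values would still need tuning.

```cpp
// Hypothetical per-unit-type training configuration.
struct TrainingConfig {
    double learning_rate;
    double weight_decay;
};

inline TrainingConfig config_for(bool nrelu) {
    return nrelu ? TrainingConfig{0.01, 0.0002}   // hypothetical: smaller step for unbounded units
                 : TrainingConfig{0.1, 0.0002};   // hypothetical: typical binary-unit defaults
}
```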