Conjugate Gradient with Rectified Linear Unit
wichtounet opened this issue · comments
Hi,
First, thanks a lot for the code of this project; it helped me a lot in implementing a working RBM and DBN.
The CG implementation works quite well with sigmoid hidden units, but does not seem to work with NReLU hidden units (max(0, x) as the activation probability function and max(0, x + N(0, sigmoid(x))) for sampling). Do you have an idea of what should be changed to make it work with these units?
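For reference, the NReLU unit described above can be sketched as follows. This is a minimal illustration, not code from the project; the function names `nrelu_mean` and `nrelu_sample` are hypothetical, and it assumes N(0, sigmoid(x)) denotes Gaussian noise with variance sigmoid(x):

```cpp
#include <algorithm>
#include <cmath>
#include <random>

// Mean activation of an NReLU hidden unit: max(0, x).
double nrelu_mean(double x) {
    return std::max(0.0, x);
}

// Sample from an NReLU unit: add Gaussian noise with variance
// sigmoid(x) to the pre-activation, then rectify.
double nrelu_sample(double x, std::mt19937& rng) {
    double variance = 1.0 / (1.0 + std::exp(-x));  // sigmoid(x)
    std::normal_distribution<double> noise(0.0, std::sqrt(variance));
    return std::max(0.0, x + noise(rng));
}
```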
Thanks
I think you'll need to modify the gradient() function.
Never mind, I made several mistakes. The condition at https://github.com/jdeng/rbm-mnist/blob/master/src/rbm.h#L581-582 is indeed very important. The greedy layer-wise pre-training needs to be configured differently, and the learning rate and weight decay need to be specially tuned for NReLU units. Although it converges more slowly than with stochastic binary hidden units, it now seems to work.