Conjugate Gradient with Rectified Linear Unit
wichtounet opened this issue · comments
Hi,
First, thanks a lot for the code of this project; it helped me a lot in implementing a working RBM and DBN.
The CG implementation works quite well with sigmoid hidden units, but does not seem to work with NReLU hidden units (max(0, x) as the activation probability function and max(0, x + N(0, sigmoid(x))) for sampling). Do you have an idea of what should be changed to make it work with these units?
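For reference, the NReLU unit described above can be sketched as follows. This is a minimal illustration, not code from the project; the function names `nrelu_mean` and `nrelu_sample` are hypothetical, and it assumes N(0, sigmoid(x)) denotes Gaussian noise with variance sigmoid(x):

```cpp
#include <algorithm>
#include <cmath>
#include <random>

// Mean activation of an NReLU hidden unit: max(0, x).
double nrelu_mean(double x) {
    return std::max(0.0, x);
}

// Sample from an NReLU unit: add Gaussian noise with variance
// sigmoid(x) to the pre-activation, then rectify.
double nrelu_sample(double x, std::mt19937& rng) {
    double variance = 1.0 / (1.0 + std::exp(-x));  // sigmoid(x)
    std::normal_distribution<double> noise(0.0, std::sqrt(variance));
    return std::max(0.0, x + noise(rng));
}
```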
Thanks
I think you'll need to modify the gradient() function.
Never mind, I made several mistakes. The condition at https://github.com/jdeng/rbm-mnist/blob/master/src/rbm.h#L581-582 is indeed very important. The greedy layer-wise pre-training needs to be configured differently, and the learning rate and weight decay need to be specially tuned for NReLU units. Although it converges more slowly than with stochastic binary hidden units, it now seems to work.