fff-rs / juice

The Hacker's Machine Learning Engine

Implement dropout

drahnr opened this issue · comments

Should be pretty straightforward; a warm-up for #10:

  • expand the cudnn bindings in rcudnn
  • use the rcudnn bindings in coaster-nn
  • create an appropriate interface in coaster
  • use that interface to define a layer in juice
  • implement tests
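Independent of the cuDNN bindings, the layer ultimately has to compute inverted dropout: drop each unit with probability p during training and scale the survivors by 1/(1-p) so expected activations are unchanged. A minimal CPU sketch follows; all names are illustrative and not part of the coaster/juice API, and a tiny deterministic LCG stands in for a real RNG so the example needs no external crates.

```rust
/// Tiny deterministic LCG so the example needs no external crates.
/// (A real implementation would use a proper RNG, e.g. cuDNN's dropout state.)
fn lcg_next(state: &mut u64) -> f32 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    // Use the top 24 bits to form a float in [0, 1).
    ((*state >> 40) as f32) / (1u32 << 24) as f32
}

/// Forward pass of inverted dropout: drop each unit with probability `p`
/// and scale the survivors by 1/(1-p). Returns the output and the keep-mask
/// (the mask must be stored for the backward pass).
fn dropout_forward(input: &[f32], p: f32, seed: u64) -> (Vec<f32>, Vec<bool>) {
    let mut state = seed;
    let scale = 1.0 / (1.0 - p);
    let mask: Vec<bool> = input.iter().map(|_| lcg_next(&mut state) >= p).collect();
    let output = input
        .iter()
        .zip(&mask)
        .map(|(&x, &keep)| if keep { x * scale } else { 0.0 })
        .collect();
    (output, mask)
}

fn main() {
    let input = vec![1.0, 2.0, 3.0, 4.0];
    let (output, mask) = dropout_forward(&input, 0.5, 42);
    // Dropped units are exactly zero; kept units are scaled by 1/(1-0.5) = 2.
    for ((&x, &y), &keep) in input.iter().zip(&output).zip(&mask) {
        if keep {
            assert!((y - 2.0 * x).abs() < 1e-6);
        } else {
            assert_eq!(y, 0.0);
        }
    }
    println!("mask = {:?}", mask);
    println!("output = {:?}", output);
}
```

Note that at inference time dropout becomes the identity; with the inverted scheme no rescaling of weights is needed when the mask is disabled.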

Paper: http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf

Why is its backprop commented out?

If I read the paper correctly, the backpropagation is just a unit factor, which can be skipped.
I am on my phone so I cannot review the code right now, but IIRC the backprop skips all dropped elements, which enables a good speedup.

Actually, that is incorrect: backprop should only propagate through the thinned network (section 5.1 of the linked paper), so unless the weights are zero, backprop may not be skipped.

Reviewing the paper, the thinned network essentially amounts to setting the gradient of the dropped units to zero, which is easily done.
The normalization should be realized by an additional mechanism or variation parameter, which can be introduced in a separate PR.
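Zeroing the gradient of dropped units means the backward pass just reapplies the keep-mask from the forward pass, with the same 1/(1-p) scale if inverted dropout is used. A hedged sketch, with illustrative names that are not juice's actual API:

```rust
/// Backward pass of inverted dropout: propagate gradients only through the
/// units kept in the forward pass (the "thinned network" of section 5.1),
/// zeroing the rest, and reapply the same 1/(1-p) scale.
fn dropout_backward(grad_output: &[f32], mask: &[bool], p: f32) -> Vec<f32> {
    let scale = 1.0 / (1.0 - p);
    grad_output
        .iter()
        .zip(mask)
        .map(|(&g, &keep)| if keep { g * scale } else { 0.0 })
        .collect()
}

fn main() {
    // The mask here would be the one saved by the forward pass.
    let grad_out = vec![0.1, 0.2, 0.3, 0.4];
    let mask = vec![true, false, true, false];
    let grad_in = dropout_backward(&grad_out, &mask, 0.5);
    // Dropped units get zero gradient; kept units are scaled by 2.0.
    assert_eq!(grad_in, vec![0.2, 0.0, 0.6, 0.0]);
    println!("{:?}", grad_in);
}
```

Since multiplication by the mask is elementwise, this is cheap, and skipping the dropped elements entirely is exactly the speedup mentioned above.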