RNN LSTM layer
bhack opened this issue
@jeffdonahue @sguada Do you have code to share for your publication: http://arxiv.org/abs/1411.4389?
Thanks for your interest in our work! We definitely plan to release code, but there's quite a bit of work to do to get it into a reasonably sane state -- there will be PRs once it's ready.
@jeffdonahue Could you offer some comments on integrating Caffe with the LSTM code from karpathy? Cheers.
@bittnt Do you mean @karpathy at https://github.com/karpathy/neuraltalk?
@bhack Yes. I think someone has already hacked it together. :))
You should see this: https://github.com/dophist/kaldi-lstm (it includes CUDA code too; I think it's the best one).
https://github.com/junhyukoh/caffe-lstm is nice work from U-Mich.
@junhyukoh Do you plan to contribute back to Caffe with a PR?
@junhyukoh looking forward to your merge :D
@bhack @sunbaigui Thank you for your interest!
However, I don't think my current implementation fits Caffe perfectly.
Since I treat each mini-batch as a single training sequence (the RNN unrolled over time), my code supports only plain SGD, not mini-batch updates (one update after processing several training sequences); see the sketch after this comment.
I plan to rewrite the code and open a PR when it's ready, but I cannot guarantee the timeline.
Feel free to use my code and any comments are welcome!
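For readers unfamiliar with the distinction above, here is a minimal, hypothetical Python sketch (not code from caffe-lstm) contrasting per-sequence SGD, where the parameters are updated after every training sequence, with a mini-batch update that accumulates gradients over several sequences before a single update. The gradient function and parameters are toy stand-ins.

```python
import numpy as np

# Hypothetical illustration (not code from caffe-lstm): contrast per-sequence
# SGD with a mini-batch update. `grad_for_sequence` stands in for one
# forward/backward pass of an RNN unrolled over a whole sequence.

def grad_for_sequence(params, seq):
    # Toy gradient of the squared loss 0.5 * (params - mean(seq))^2.
    return params - np.mean(seq)

def sgd_per_sequence(params, sequences, lr=0.1):
    """One parameter update per training sequence (what the code supports)."""
    for seq in sequences:
        params = params - lr * grad_for_sequence(params, seq)
    return params

def minibatch_update(params, sequences, lr=0.1, batch_size=4):
    """One update after accumulating gradients over several sequences."""
    for i in range(0, len(sequences), batch_size):
        batch = sequences[i:i + batch_size]
        grad = np.mean([grad_for_sequence(params, s) for s in batch], axis=0)
        params = params - lr * grad  # a single update for the whole mini-batch
    return params

sequences = [np.random.randn(20) for _ in range(8)]
print(sgd_per_sequence(np.zeros(1), sequences))
print(minibatch_update(np.zeros(1), sequences))
```

The second scheme is roughly the kind of gradient accumulation that Caffe's solver exposes via its iter_size parameter.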
Closing; see #1873 for (a cleaned-up version of) the implementation we used for LRCN.