2D LSTM ?

Question

seragENTp opened this issue 8 years ago

What is the architecture of the 2D LSTM implemented in the library? Is there a reference for it?

Tom · Answer 1 · Fri Oct 28 2016 05:38:33 GMT+0800 (China Standard Time)

It's a bidirectional LSTM running over the rows of the image, then over the columns. The input is usually the output from a convolutional layer.
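The rows-then-columns scheme described above can be sketched in a few lines of plain Python. This is only an illustration of the scan order, not the CLSTM code: the names `LSTMCell`, `bidirectional`, and `rows_then_columns` are hypothetical, the weights are random, and concatenating the forward and backward outputs is an assumption about how the two directions are combined.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Minimal 1D LSTM cell with random weights (illustrative, untrained)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight row per gate unit; columns are [input, previous h, bias].
        n_cols = n_in + n_hidden + 1
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_cols)]
                  for _ in range(4 * n_hidden)]

    def step(self, x, h, c):
        z = x + h + [1.0]  # concatenate input, previous state, bias term
        pre = [sum(wi * zi for wi, zi in zip(row, z)) for row in self.w]
        n = self.n_hidden
        i = [sigmoid(v) for v in pre[0:n]]            # input gate
        f = [sigmoid(v) for v in pre[n:2 * n]]        # single forget gate
        o = [sigmoid(v) for v in pre[2 * n:3 * n]]    # output gate
        g = [math.tanh(v) for v in pre[3 * n:4 * n]]  # candidate update
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_sequence(cell, seq):
    """Run the cell left to right over a list of feature vectors."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    out = []
    for x in seq:
        h, c = cell.step(x, h, c)
        out.append(h)
    return out


def bidirectional(fwd, bwd, seq):
    """Forward and backward passes, concatenated per position (assumption)."""
    f = run_sequence(fwd, seq)
    b = run_sequence(bwd, seq[::-1])[::-1]
    return [fv + bv for fv, bv in zip(f, b)]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def rows_then_columns(image, n_hidden, seed=0):
    """image: H x W x C nested lists. Returns H x W x (2 * n_hidden)."""
    rng = random.Random(seed)
    n_in = len(image[0][0])
    row_f = LSTMCell(n_in, n_hidden, rng)
    row_b = LSTMCell(n_in, n_hidden, rng)
    col_f = LSTMCell(2 * n_hidden, n_hidden, rng)
    col_b = LSTMCell(2 * n_hidden, n_hidden, rng)
    # Pass 1: bidirectional sweep along every row.
    horiz = [bidirectional(row_f, row_b, row) for row in image]
    # Pass 2: bidirectional sweep along every column of the pass-1 output.
    vert = [bidirectional(col_f, col_b, col) for col in transpose(horiz)]
    return transpose(vert)


# Usage: a 3 x 5 "feature map" with 2 channels, e.g. from a conv layer.
image = [[[0.1 * (r + c), 0.5] for c in range(5)] for r in range(3)]
out = rows_then_columns(image, n_hidden=4)
print(len(out), len(out[0]), len(out[0][0]))  # 3 5 8
```

Because the column pass reads the output of the row pass, each output pixel can draw on the whole image, even though every individual LSTM here is one-dimensional.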

There is a bit more info and references in this paper:

http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Byeon_Scene_Labeling_With_2015_CVPR_paper.html

Amit D. · Answer 2 · Fri Oct 28 2016 07:16:27 GMT+0800 (China Standard Time)

Tom,
I remember you wrote some time ago that 2D LSTM is not better than 1D LSTM for OCR of printed text. Is that still true, and does it hold for all scripts?

Tom · Answer 3 · Fri Oct 28 2016 07:19:19 GMT+0800 (China Standard Time)

Getting good performance out of 1D LSTM requires a good normalizer. The Ocropus normalizer works surprisingly well for some non-Latin scripts, but we really need more benchmarks to see how far that carries over.

The normalizer is a fairly tricky piece of code, so it would be nice to be able to dispense with it. I'll be experimenting with that once the basic GPU implementation is done.

Mohamed Yousef · Answer 4 · Fri Oct 28 2016 11:08:34 GMT+0800 (China Standard Time)

@tmbdev What you are describing, and what is implemented in CLSTM, is essentially a ReNet-style [1] LSTM,
but what is in the paper [2] is the 2D case of the classical MDLSTM [3] (by Graves et al.), with two forget gates.

[1] https://arxiv.org/abs/1505.00393
[2] http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Byeon_Scene_Labeling_With_2015_CVPR_paper.html
[3] https://arxiv.org/abs/0705.2011
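
To make the distinction concrete, the two cell-state recurrences can be written schematically as follows. This is a sketch of the general form only, with biases and peephole connections omitted; the notation ($c$ for the cell state at pixel $(i,j)$, $\iota$ for the input gate, $f$ for forget gates, $g$ for the candidate update) is chosen here for illustration and is not copied from any of the cited papers.

```latex
% ReNet-style row sweep: one predecessor, one forget gate
c_{i,j} = f_{i,j} \odot c_{i,j-1} + \iota_{i,j} \odot g_{i,j}

% 2D MDLSTM: two predecessors, one forget gate per dimension
c_{i,j} = f^{(1)}_{i,j} \odot c_{i-1,j}
        + f^{(2)}_{i,j} \odot c_{i,j-1}
        + \iota_{i,j} \odot g_{i,j}
```

In the first form the gates at $(i,j)$ see only the input and the previous hidden state along the current row, and the column sweep is a separate layer stacked on top. In the second form the gates see both $h_{i-1,j}$ and $h_{i,j-1}$ within a single layer.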

Tom · Answer 5 · Fri Oct 28 2016 11:54:00 GMT+0800 (China Standard Time)

Correct, ReNet implements the same model we do, and both models are different from the original 2D LSTM. We published some additional papers in 2014, and I gave some tutorials on these kinds of multidimensional LSTMs in 2013.

morusu · Answer 6 · Wed Jan 18 2017 22:40:26 GMT+0800 (China Standard Time)

Hi, where is the 2D LSTM? I cannot find the implementation. Did I miss something?

Amit D. · Answer 7 · Wed Jan 18 2017 23:43:47 GMT+0800 (China Standard Time)

https://github.com/tmbdev/clstm/blob/master/test-2d.cc

Amit D. · Answer 8 · Wed Jan 18 2017 23:47:04 GMT+0800 (China Standard Time)

... and https://github.com/tmbdev/clstm/blob/509144def09c/clstm_prefab.cc#L130