renll / RSAM

Recurrent Soft Attention Model

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Recurrent Soft Attention Model

General Model Structure:

RSAM structure for 1 timestamp

The weighted context information, i.e. the soft attention, is fed into the model through the down-sample network that consists of a 1x1 feature-map down-sampling convolutional layer in each glimpse timestamp.

The attention masked images

The soft attention masked images generated for each glimpse.

The arxiv link of the paper

Related Works

[1] J. Ba, V. Mnih, and K. Kavukcuoglu. Multiple object recognition with visual attention. arXiv preprint arXiv:1412.7755, 2014.

[2] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pages 1026–1034, 2015.

[3] K.He,X.Zhang,S.Ren,andJ.Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

[4] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735– 1780, 1997.

[5] H. Larochelle and G. E. Hinton. Learning to combine foveal glimpses with a third-order boltzmann machine. In Advances in neural information processing systems, pages 1243–1251, 2010.

[6] V. Mnih, N. Heess, A. Graves, et al. Recurrent models of visual attention. In Advances in neural information processing systems, pages 2204–2212, 2014.

[7] J.Redmon,S.Divvala,R.Girshick,andA.Farhadi. You only look once: Unified, real-timeobject detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.

[8] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning, pages 2048–2057, 2015

About

Recurrent Soft Attention Model

License:MIT License


Languages

Language:Python 100.0%