nnUyi / DeepLearning-Tutorials

This is a deep learning tutorial, summarized to help anyone who wants to get started with deep learning.

DeepLearning-Tutorials

  • This is a deep learning tutorial. More state-of-the-art papers and methods will be added over time.

Book List

Chinese Book

  • Machine Learning in Action (《机器学习实战》) -- Peter Harrington

  • Machine Learning (《机器学习》) -- Zhou Zhihua

  • Statistical Learning Methods (《统计学习方法》) -- Li Hang

  • Neural Networks and Deep Learning (《神经网络与深度学习》) -- Qiu Xipeng. link

  • Deep Learning (《深度学习》) -- Ian Goodfellow, Yoshua Bengio et al. link

English Book

Courses List

Paper List

Application

Computer Vision

Image Classification

[0] LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. (LeNet-5) ⭐⭐⭐⭐⭐

[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. (AlexNet, Deep Learning Breakthrough) ⭐⭐⭐⭐⭐

[2] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).(VGGNet,Neural Networks become very deep!) ⭐⭐⭐

[3] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.(GoogLeNet) ⭐⭐⭐

[4] He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015).(ResNet,Very very deep networks, CVPR best paper) ⭐⭐⭐⭐⭐
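
As a quick illustration of the residual connection introduced by ResNet [4], here is a minimal sketch of a residual block (assuming PyTorch; the layer sizes are illustrative, not the exact ResNet configuration):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                      # skip connection keeps the input
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)  # residual: learn F(x), output F(x) + x

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```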

Object Detection

[0] Evan Shelhamer, Jonathan Long, Trevor Darrell:Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (2017).FCN

[1] Ross B. Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik:Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. CVPR 2014.RCNN

[2] Ross Girshick: Fast R-CNN: Fast Region-based Convolutional Networks for object detection. ICCV 2015. Fast RCNN

[3] Shaoqing Ren, Kaiming He, Ross B. Girshick, Jian Sun:Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.Faster RCNN

[4] Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross B. Girshick: Mask R-CNN. ICCV 2017. Mask RCNN

[5] Object Detection Summary

Semantic Segmentation

[0] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation." In CVPR, 2015. ⭐⭐⭐⭐⭐

[1] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015. ⭐⭐⭐⭐⭐

[2] Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In: NIPS. 2015.

[3] Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." in CVPR. 2016

[4] Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678(2016).

[5] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille:Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. CoRR abs/1412.7062 (2014). deeplab1, deeplab1_ppt

[6] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille:DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. CoRR abs/1606.00915 (2016). deeplab2

[7] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille: DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4): 834-848 (2018). deeplab3

Super-Resolution

[0] Dong, Chao, et al. "Image super-resolution using deep convolutional networks." IEEE Transactions on Pattern Analysis and Machine Intelligence 38.2 (2016): 295-307. SRCNN

[1] Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Deeply-recursive convolutional network for image super-resolution." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . 2016.DRCN

[2] Shi, Wenzhe, et al. "Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . 2016.ESPCN

[3] Caballero, Jose, et al. "Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation." arXiv preprint arXiv:1611.05250 (2016).VESPCN

[4] Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." arXiv preprint arXiv:1609.04802 (2016).SRGAN

[5] Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen:Progressive Growing of GANs for Improved Quality, Stability, and Variation. CoRR abs/1710.10196 (2017).PGGAN

[6] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro:High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. CoRR abs/1711.11585 (2017).Pix2PixHD

[7] Haris M, Shakhnarovich G, Ukita N. Deep Back-Projection Networks For Super-Resolution[J]. arXiv preprint arXiv:1803.02735, 2018.DBPN supplementary material
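
For reference, SRCNN [0] is a very small network; a rough sketch (assuming PyTorch, and assuming the low-resolution input has already been upscaled, e.g. bicubically, to the target size):

```python
import torch
import torch.nn as nn

srcnn = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9, padding=4),  # patch extraction
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=1),            # non-linear mapping
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=5, padding=2),  # reconstruction
)

upscaled = torch.randn(1, 1, 128, 128)  # bicubic-upsampled luminance channel
print(srcnn(upscaled).shape)            # torch.Size([1, 1, 128, 128])
```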

Deep Learning in SLAM

Depth and Pose

[0] Keisuke Tateno, Federico Tombari, Iro Laina, Nassir Navab: CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction. CVPR 2017 ⭐⭐⭐⭐⭐

[1] Vikram Mohanty, Shubh Agrawal, Shaswat Datta, Arna Ghosh, Vishnu Dutt Sharma, Debashish Chakravarty:DeepVO: A Deep Learning approach for Monocular Visual Odometry. CoRR abs/1611.06069 (2016)

[2] Sen Wang, Ronald Clark, Hongkai Wen, Niki Trigoni:DeepVO: Towards end-to-end visual odometry with deep Recurrent Convolutional Neural Networks. ICRA 2017: 2043-2050

[3] Benjamin Ummenhofer, Huizhong Zhou, Jonas Uhrig, Nikolaus Mayer, Eddy Ilg, Alexey Dosovitskiy, Thomas Brox:DeMoN: Depth and Motion Network for Learning Monocular Stereo. CoRR abs/1612.02401 (2016)

[4] Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers:Image-based Localization with Spatial LSTMs. CoRR abs/1611.07890 (2016)

[5] Alex Kendall, Roberto Cipolla:Geometric loss functions for camera pose regression with deep learning. CoRR abs/1704.00390 (2017)

[6] Kishore Reddy Konda, Roland Memisevic:Learning Visual Odometry with a Convolutional Network. VISAPP (1) 2015: 486-490

[7] Yevhen Kuznietsov, Jörg Stückler, Bastian Leibe: Semi-Supervised Deep Learning for Monocular Depth Map Prediction. CoRR abs/1702.02706 (2017) ⭐⭐⭐⭐⭐

[8] Ruihao Li, Sen Wang, Zhiqiang Long, Dongbing Gu:UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning. CoRR abs/1709.06841 (2017)

[9] Kishore Reddy Konda, Roland Memisevic:Unsupervised learning of depth and motion. CoRR abs/1312.3429 (2013)

[10] Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe: Unsupervised Learning of Depth and Ego-Motion from Video. CoRR abs/1704.07813 (2017) ⭐⭐⭐⭐⭐

[11] Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow: Unsupervised Monocular Depth Estimation with Left-Right Consistency. CoRR abs/1609.03677 (2016) ⭐⭐⭐⭐⭐

Optical Flow

[0] Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data

[1] Anurag Ranjan, Michael J. Black: Optical Flow Estimation using a Spatial Pyramid Network. CoRR abs/1611.00850 (2016)

[2] Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox: FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. CoRR abs/1612.01925 (2016) ⭐⭐⭐⭐⭐

Future of SLAM

[0] The Future of Real-Time SLAM and Deep Learning vs SLAM. SLAM

Other state-of-the-art Paper

[0] Dan C. Ciresan, Ueli Meier, Jonathan Masci, Luca Maria Gambardella, Jürgen Schmidhuber:High-Performance Neural Networks for Visual Object Classification. CoRR abs/1102.0183 (2011)

[1] T Miyato, S Maeda, M Koyama, K Nakae, S Ishii:Distributional Smoothing With Virtual Adversarial Training. CS(2015)

[2] Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton: Dynamic Routing Between Capsules. NIPS (2017) ⭐⭐⭐⭐⭐

[3] Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen:Progressive Growing of GANs for Improved Quality, Stability, and Variation. ICLR(2018)

[4] Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Shin Ishii:Virtual Adversarial Training: a Regularization Method for Supervised and Semi-supervised Learning. CoRR abs/1704.03976 (2017)

Report of Computer Vision

[0] A Year in Computer Vision. cv

Natural Language Processing

Speech Recognition

Reinforcement Learning

Transfer Learning

Tutorial

[0] Transfer Learning Tutorial (《迁移学习简明手册》). link

Model

Unsupervised Model

[0] Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.(Milestone, Andrew Ng, Google Brain Project, Cat)

[1] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).(VAE) ⭐⭐⭐⭐⭐

[2] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.(GAN,super cool idea) ⭐⭐⭐⭐⭐

[3] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).(DCGAN) ⭐⭐⭐⭐⭐

[4] Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015). [pdf] (VAE with attention, outstanding work) ⭐⭐⭐⭐⭐

[5] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016). (PixelRNN)

[6] Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016).

[7] Aäron van den Oord, Nal Kalchbrenner, Lasse Espeholt, Koray Kavukcuoglu, Oriol Vinyals, Alex Graves: Conditional Image Generation with PixelCNN Decoders. NIPS 2016: 4790-4798.pixelCNN

[8] Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma: PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications. CoRR abs/1701.05517 (2017).PixelCNN++
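
As a quick illustration of the VAE [1], here is a minimal sketch of the reparameterization trick and the KL term (assuming PyTorch; dimensions are illustrative):

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mu and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)      # reparameterization: z = mu + sigma * eps
        recon = torch.sigmoid(self.dec(z))
        # KL divergence between q(z|x) and the standard normal prior
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return recon, kl

x = torch.rand(8, 784)
recon, kl = TinyVAE()(x)
print(recon.shape, kl.shape)  # torch.Size([8, 784]) torch.Size([8])
```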

RNN, LSTM, GRU, etc.

[0] Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).(LSTM, very nice generating result, show the power of RNN)

[1] Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).(First Seq-to-Seq Paper)

[2] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.(Outstanding Work) ⭐⭐⭐⭐⭐

[3] Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014).

[4] Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv preprint arXiv:1506.05869 (2015).(Seq-to-Seq on Chatbot)

[5] Understanding LSTM Networks ⭐⭐⭐⭐⭐
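
A minimal sketch of running an LSTM over a sequence, in the spirit of the sequence models above (assuming PyTorch; shapes are illustrative):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=1, batch_first=True)
x = torch.randn(4, 10, 32)          # (batch, sequence length, feature dim)
outputs, (h_n, c_n) = lstm(x)       # outputs: hidden state at every time step
print(outputs.shape, h_n.shape)     # torch.Size([4, 10, 64]) torch.Size([1, 4, 64])
```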

CNN (Convolutional Neural Networks)

[0] Dilated Convolutional Kernel - Fisher Yu, Vladlen Koltun:Multi-Scale Context Aggregation by Dilated Convolutions. ICLR(2016)

[1] Deformable Convolutional Kernel - Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei:Deformable Convolutional Networks. CoRR abs/1703.06211 (2017)

[2] Convolution Operations. link

[3] Convolution Analyzer. link

[4] What Do We Understand About Convolutional Networks? link
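
A small sketch of a dilated (atrous) convolution as in [0] (assuming PyTorch): with dilation=2, a 3x3 kernel covers a 5x5 receptive field without adding parameters, and padding=2 keeps the spatial size.

```python
import torch
import torch.nn as nn

dilated = nn.Conv2d(16, 16, kernel_size=3, dilation=2, padding=2)
x = torch.randn(1, 16, 64, 64)
print(dilated(x).shape)  # torch.Size([1, 16, 64, 64])
```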

Lightweight Convolutional Neural Networks

[0] Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, Kurt Keutzer:SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs/1602.07360 (2016). SqueezeNet

[1] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam:MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. CoRR abs/1704.04861 (2017). MobileNets

[2] Mark Sandler, Andrew G. Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. CoRR abs/1801.04381 (2018). MobileNets_V2

[3] François Chollet:Xception: Deep Learning with Depthwise Separable Convolutions. CVPR 2017: 1800-1807. Xception

[4] Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun:ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CoRR abs/1707.01083 (2017). ShuffleNet

[5] Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le: Learning Transferable Architectures for Scalable Image Recognition. CoRR abs/1707.07012 (2017). NasNet

[6] Robert J. Wang, Xiang Li, Shuang Ao, Charles X. Ling:Pelee: A Real-Time Object Detection System on Mobile Devices. CoRR abs/1804.06882 (2018). PeleeNet
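
Most of these lightweight networks build on the depthwise separable convolution (MobileNets [1], Xception [3]); a minimal sketch, assuming PyTorch and illustrative channel counts:

```python
import torch
import torch.nn as nn

def depthwise_separable(in_ch, out_ch):
    return nn.Sequential(
        # depthwise: one 3x3 filter per input channel (groups=in_ch)
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        # pointwise: 1x1 convolution mixes channels
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

x = torch.randn(1, 32, 56, 56)
print(depthwise_separable(32, 64)(x).shape)  # torch.Size([1, 64, 56, 56])
```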

Model Constraints

[0] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012). (Dropout)

[1] Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958.

[2] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).(An outstanding Work in 2015)

[3] Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016). (Update of Batch Normalization)

[4] Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1." (New Model, Fast)

[5] Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016). (Innovation of Training Method,Amazing Work) ⭐⭐⭐⭐⭐

[6] Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015). (Modify previously trained network to reduce training epochs)

[7] Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016). (Modify previously trained network to reduce training epochs)

[8] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 2 (2015). (ICLR best paper, new direction to make NN running fast,DeePhi Tech Startup) ⭐⭐⭐⭐⭐

[9] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016). (Also a new direction to optimize NN, DeePhi Tech Startup)
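
A small sketch showing where dropout [0][1] and batch normalization [2] are typically inserted in a network (assuming PyTorch; sizes are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # normalize activations per mini-batch
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),     # randomly zero half the activations during training
    nn.Linear(256, 10),
)

model.train()   # dropout active, batch norm uses batch statistics
model.eval()    # dropout disabled, batch norm uses running statistics
print(model(torch.randn(16, 784)).shape)  # torch.Size([16, 10])
```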

Optimization

Optimization Methods

[0] Sebastian Ruder: An overview of gradient descent optimization algorithms. CoRR abs/1609.04747 (2016) ⭐⭐⭐⭐⭐

[1] Back Propagation Algorithm

[2] Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016).(Neural Optimizer,Amazing Work)

Optimization Functions

  • Momentum
  • Nesterov accelerated gradient
  • Adagrad
  • Adadelta
  • RMSprop
  • Adam
  • AdaMax
  • Nadam

⭐⭐⭐⭐⭐ Adam is usually a good default choice (see the sketch below).
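
A minimal training-step sketch with Adam (assuming PyTorch; any optimizer listed above can be swapped in through torch.optim):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # try SGD+Momentum, RMSprop, ...
loss_fn = nn.MSELoss()

x, y = torch.randn(32, 10), torch.randn(32, 1)
for step in range(100):
    optimizer.zero_grad()          # clear old gradients
    loss = loss_fn(model(x), y)
    loss.backward()                # back-propagate gradients
    optimizer.step()               # Adam update of the parameters
```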

Types of Activation

  • sigmoid
  • hard sigmoid
  • tanh
  • relu
  • leaky relu
  • elu
  • selu
  • prelu
  • maxout
  • swish
  • softplus
  • softshrink
  • softsign
  • tanhshrink
  • softmin
  • softmax
  • logsoftmax
  • softmax2d
  • etc.

ReLU, leaky ReLU, tanh, and sigmoid are strongly recommended (see the sketch below).
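
A quick sketch of a few of the activations listed above (assuming PyTorch):

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-3, 3, 7)
print(torch.sigmoid(x))
print(torch.tanh(x))
print(F.relu(x))
print(F.leaky_relu(x, negative_slope=0.01))
print(F.softmax(x, dim=0))   # normalizes to a probability distribution
```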

Conferences and Journals

Machine Learning and Theories

  • NIPS
  • ICML
  • ICLR

 Computer Vision

  • CVPR
  • ICCV
  • ECCV

Natural Language Processing

  • EMNLP
  • ACL

Artificial Intelligence

  • AAAI
  • IJCAI

WeChat Public Accounts

  • 机器之心 (Synced)
  • 新智元 (AI Era)

Deep Learning Frameworks (open source)

New Architecture

  • Convolution Neural Networks

  • Recurrent Neural Networks

  • Generative Adversarial Networks

  • Capsules(Dynamic Routing Between Capsules--by Hinton)

    blog

    video

    official codes

  • DenseNet: Densely Connected Convolutional Networks. DenseNet

  • DiracNets: Training Very Deep Neural Networks Without Skip-Connections. DiracNet

  • Non-local Neural Networks. Non-Local Nets

  • Convolutional Neural Networks with Alternately Updated Clique. CliqueNet

Other Sources

Generative Adversarial Networks (GAN):

  • GAN Paper

  • GAN Tricks

  • GAN Tutorial 2018CVPR

  • From GAN to WGAN

  • GAN Codes

    Tensorflow_1

    Tensorflow_2

    Pytorch

  • GAN Performance Report

  • GAN video

  • 10 papers for GAN (strongly recommended)

    • Progressive Growing of GANs for Improved Quality, Stability, and Variation
    • Spectral Normalization for Generative Adversarial Networks
    • cGANs with Projection Discriminator
    • High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
    • Are GANs Created Equal? A Large-Scale Study
    • Improved Training of Wasserstein GANs
    • StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
    • Privacy-preserving generative deep neural networks support clinical data sharing
    • Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
    • Gradient descent GAN optimization is locally stable
  • Something interesting about GAN

    (1) CycleGAN

    (2) progressively growing GAN (PGGAN)
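
For orientation, a bare-bones GAN training loop looks roughly like the sketch below (assuming PyTorch; the networks and data are placeholders, not any particular paper's architecture):

```python
import torch
import torch.nn as nn

z_dim, x_dim = 16, 64
G = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))
D = nn.Sequential(nn.Linear(x_dim, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, x_dim)                # stand-in for real data
    fake = G(torch.randn(32, z_dim))

    # Discriminator: real -> 1, fake -> 0 (fake detached so G is not updated here)
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: fool the discriminator into predicting 1 for fake samples
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(32, 1))
    g_loss.backward()
    opt_g.step()
```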

Deep Architecture Genealogy

Python Resources

Computer Vision

Classification

Geometry and SLAM

Object Detection

Face Datasets

Dehazing Datasets

Deraining Datasets

References

Contacts