dd2912 / ml_papers

1 Introduction to Deep Learning

Text Book

  1. Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. Deep learning. An MIT Press book. (2015). pdf

High-level Survey

  1. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature 521.7553 (2015): 436-444. pdf

Courses

  1. MIT 6.S191 Introduction to Deep Learning web
  2. Dive into Deep Learning web

2 Convolutional Neural Networks (CNNs)

LeNet: Image Classification on Handwritten Digits and Image Classification on ImageNet

  1. Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. Gradient-Based Learning Applied to Document Recognition.  Proceedings of the IEEE, 86(11):2278-2324. 1998. pdf (Seminal Paper: LeNet)
  2. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012. pdf
  3. Simonyan, Karen, and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). pdf
  4. Szegedy, Christian, et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. pdf
  5. He, Kaiming, et al. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015). pdf (ResNet)
  6. Huang, G. et al. Densely Connected Convolutional Networks. arXiv preprint arXiv:1608.06993 (2017) pdf (DenseNet)
  7. Hu, Jie et al.  Squeeze-and-Excitation Networks. arXiv preprint arXiv:1709.01507 (2017) pdf
  8. Howard, A. G. et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. pdf
  9. Tan, M. and Le, Q. V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. pdf
  10. Xie, Q. et al. Self-training with Noisy Student improves ImageNet classification. pdf
  11. Bojarski, M. et al. End to End Learning for Self-Driving Cars. pdf

3 Object Detection

  1. H. A. Rowley, S. Baluja, and T. Kanade, Neural network-based face detection, Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognition, pp. 203–208, 1996. pdf
  2. P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pdf
  3. Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. Deep neural networks for object detection. Advances in Neural Information Processing Systems. 2013. pdf
  4. Girshick, Ross, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. pdf (R-CNN)
  5. He, Kaiming, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition. European Conference on Computer Vision. Springer International Publishing, 2014. pdf (SPPNet)
  6. Girshick, Ross. Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision. 2015. pdf
  7. Ren, Shaoqing, et al. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in neural information processing systems. 2015. pdf
  8. Redmon, Joseph, et al. You only look once: Unified, real-time object detection. arXiv preprint arXiv:1506.02640 (2015). pdf
  9. Liu, Wei, et al. SSD: Single Shot MultiBox Detector. arXiv preprint arXiv:1512.02325 (2015). pdf
  10. Dai, Jifeng, et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks. arXiv preprint arXiv:1605.06409 (2016). pdf
  11. K. He et al. Mask R-CNN. arXiv preprint arXiv:1703.06870 (2017). pdf
  12. Tsung-Yi Lin et al. Feature Pyramid Networks for Object Detection. arXiv:1612.03144 (2017). pdf
  13. Esteban Real, Alok Aggarwal, Yanping Huang: Regularized Evolution for Image Classifier Architecture Search, 2018; arXiv:1802.01548 pdf
  14. Golnaz Ghiasi, Tsung-Yi Lin, Ruoming Pang: NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection, 2019; arXiv:1904.07392 pdf
  15. Chenchen Zhu, Yihui He: Feature Selective Anchor-Free Module for Single-Shot Object Detection, 2019; arXiv:1903.00621 pdf
  16. Yukang Chen, Tong Yang, Xiangyu Zhang, Gaofeng Meng, Xinyu Xiao: DetNAS: Backbone Search for Object Detection, 2019; arXiv:1903.10979 pdf
  17. Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang: CenterNet: Keypoint Triplets for Object Detection, 2019; arXiv:1904.08189 pdf
  18. Mingxing Tan, Ruoming Pang: EfficientDet: Scalable and Efficient Object Detection, 2019; arXiv:1911.09070 pdf

4 Object Segmentation and Self-Supervised Learning

Segmentation:

  1. J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation. in CVPR, 2015. pdf
  2. O. Ronneberger et al. U-Net: Convolutional Networks for Biomedical Image Segmentation. 2015. pdf
  3. Multi-Scale Context Aggregation by Dilated Convolutions. 2016. pdf
  4. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. 2016. pdf
  5. Rethinking Atrous Convolution for Semantic Image Segmentation. 2017. pdf
  6. K. He et al. Mask R-CNN. arXiv preprint arXiv:1703.06870. 2017. pdf
  7. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. 2018. pdf
  8. Learning to Segment Everything. 2018. pdf

Self-Supervised Learning:

  1. Unsupervised Visual Representation Learning by Context Prediction. 2015. pdf
  2. Colorful Image Colorization. 2016. pdf
  3. Representation Learning by Learning to Count. 2017. pdf
  4. Learning and Using the Arrow of Time. 2018. pdf
  5. Tracking Emerges by Colorizing Videos. 2018. pdf
  6. Audio-Visual Scene Analysis with Self-Supervised Multi-sensory Features. 2018. pdf
  7. Object Discovery with a Copy-Pasting GAN. 2019. pdf
  8. SimCLR: A Simple Framework for Contrastive Learning of Representations. 2020. pdf

5 Generative Adversarial Networks and Applications

Generative Adversarial Networks:

  1. Kingma, D, and Welling, M. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013). pdf
  2. Goodfellow, Ian, et al. Generative adversarial nets. 2014. pdf
  3. Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759 (2016). pdf
  4. Makhzani, Alireza, et al. Adversarial Autoencoders. arXiv:1511.05644 (2015). pdf
  5. Gregor, Karol, et al. DRAW: A recurrent neural network for image generation. arXiv:1502.04623 (2015). pdf

Applications:

  1. Wasserstein GAN. 2017. pdf
  2. Large Scale GAN Training for High Fidelity Natural Image Synthesis. 2018. pdf
  3. A Style-based Generator Architecture for Generative Adversarial Networks. 2018. pdf
  4. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. 2017. pdf
  5. Conditional LSTM-GAN for Melody Generation from Lyrics. 2019. pdf
  6. GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction. 2019. pdf

Art:

  1. Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). Inceptionism: Going Deeper into Neural Networks. Google Research. html
  2. Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015). pdf
  3. CAN: Creative Adversarial Networks. 2017. pdf
  4. Semantic Image Synthesis with Spatially-Adaptive Normalization. 2019. pdf
  5. Deep Poetry: Word-Level and Char-Level Language Models for Shakespearean Sonnet Generation pdf
  6. BachProp: Learning to Compose Music in Multiple Styles 2018. pdf
  7. A 'New' Rembrandt: From the Frontiers of AI And Not The Artist's Atelier 2016. html
  8. Is artificial intelligence set to become art’s next medium? 2018. html
  9. AI Will Enhance - Not End - Human Art 2019. html
  10. An AI-Written Novella Almost Won a Literary Prize 2016. html
  11. How AI-Generated Music Is Changing The Way Hits Are Made 2018. html
  12. AI puts final notes on Beethoven's Tenth Symphony 2019. html

Previous Papers

  1. Zhu, Jun-Yan, et al. Generative Visual Manipulation on the Natural Image Manifold. European Conference on Computer Vision. Springer International Publishing, 2016. pdf
  2. Champandard, Alex J. Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks. arXiv preprint arXiv:1603.01768 (2016). pdf
  3. Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. arXiv preprint arXiv:1603.08155 (2016). pdf
  4. Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. A learned representation for artistic style. arXiv preprint arXiv:1610.07629 (2016). pdf
  5. Gatys, Leon and Ecker, et al. Controlling Perceptual Factors in Neural Style Transfer. arXiv preprint arXiv:1611.07865 (2016). pdf
  6. Ulyanov, Dmitry and Lebedev, Vadim, et al. Texture Networks: Feed-forward Synthesis of Textures and Stylized Images. arXiv preprint arXiv:1603.03417 (2016). pdf

6 RNN / Sequence-to-Sequence Model

  1. Bengio, Yoshua, et al. A Neural Probabilistic Language Model. JMLR (2003). pdf
  2. Graves, Alex. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013). (LSTM, very nice generating result, show the power of RNN) pdf
  3. Mikolov, et al. Distributed representations of words and phrases and their compositionality. NIPS (2013). pdf
  4. Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. Advances in neural information processing systems. 2014. pdf
  5. Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. (2014). pdf

7 NLP (Natural Language Processing)

  1. Ashish Vaswani, et al. Attention is All you Need. NIPS (2017). pdf
  2. Matthew Peters, et al. Deep Contextualized Word Representations. pdf
  3. Jeremy Howard, et al. Universal Language Model Fine-Tuning for Text Classification. ACL (2018). pdf
  4. Jacob Devlin, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (2019). pdf
  5. Victor Sanh, et al. DistilBERT, a distilled version of BERT. arXiv preprint arXiv:1910.01108 (2019). pdf

8 Machine Translation

  1. Lee, et al. Fully Character-Level Neural Machine Translation without Explicit Segmentation. (2016) pdf
  2. Wu, Schuster, Chen, Le, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. pdf
  3. Jonas Gehring, et al. Convolutional Sequence to Sequence Learning. (2017). pdf
  4. Lample, et al. Phrase-Based & Neural Unsupervised Machine Translation. (2018) pdf
  5. Ye Jia, et al. Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model. (2019). pdf

9 Applications of Sequence-to-Sequence Models

  1. Wen, et al. Recurrent Neural Network Language Generation for Spoken Dialogue Systems. (2019) pdf
  2. Mrksic, et al. Multi-domain Dialog State Tracking using RNNs. (2015) pdf
  3. Srinivasan, et al. Natural Language Generation using Reinforcement Learning with External Rewards. (2019). pdf
  4. Zhu, et al. SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering. (2018) pdf
  5. Xiong, et al. Achieving Human Parity in Conversational Speech Recognition. arXiv:1610.05256 (2016). pdf

10 Reinforcement Learning

  1. Mnih, Volodymyr, et al. Playing atari with deep reinforcement learning. (2013). pdf
  2. Silver, David, et al. Mastering the game of Go with deep neural networks and tree search. (2016) pdf
  3. Silver, David, et al. Mastering the game of Go without Human Knowledge. (2017) pdf
  4. Silver, David, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. (2017) pdf
  5. OpenAI. Learning Dexterous In-Hand Manipulation. pdf

Previous Papers

  1. Mnih, Volodymyr, et al. Human-level control through deep reinforcement learning. (2015) pdf
  2. Wang, Ziyu, Nando de Freitas, and Marc Lanctot. Dueling network architectures for deep reinforcement learning. (2015). pdf
  3. Mnih, Volodymyr, et al. Asynchronous methods for deep reinforcement learning. (2016). pdf
  4. Lillicrap, Timothy P., et al. Continuous control with deep reinforcement learning. (2015). pdf
  5. Gu, Shixiang, et al. Continuous Deep Q-Learning with Model-based Acceleration. (2016). pdf
  6. Schulman, John, et al. Trust region policy optimization. CoRR, abs/1502.05477 (2015). pdf

11 Unsupervised Learning / Deep Generative Model

  1. Le, Quoc V. Building high-level features using large scale unsupervised learning. pdf
  2. Kingma, Diederik P., and Max Welling. Auto-encoding variational bayes. (2013). pdf
  3. Goodfellow, Ian, et al. Generative adversarial nets. Advances in Neural Information Processing Systems. 2014. pdf
  4. Radford, Alec, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. (2015). pdf
  5. Gregor, Karol, et al. DRAW: A recurrent neural network for image generation. (2015). pdf
  6. Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. (2016). pdf
  7. Oord, Aaron van den, et al. Conditional image generation with PixelCNN decoders. (2016). pdf

12 Image Captioning

  1. Farhadi, Ali, et al. Every picture tells a story: Generating sentences from images. 2010. pdf
  2. Kulkarni, Girish, et al. Baby talk: Understanding and generating image descriptions. 2011. pdf
  3. Vinyals, Oriol, et al. Show and tell: A neural image caption generator. 2014. pdf
  4. Donahue, Jeff, et al. Long-term recurrent convolutional networks for visual recognition and description. pdf
  5. Karpathy, Andrej, and Li Fei-Fei. Deep visual-semantic alignments for generating image descriptions. 2014. pdf
  6. Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. Deep fragment embeddings for bidirectional image sentence mapping. 2014. pdf
  7. Fang, Hao, et al. From captions to visual concepts and back. 2014. pdf
  8. Chen, Xinlei, and C. Lawrence Zitnick. Learning a recurrent visual representation for image caption generation. 2014. pdf
  9. Mao, Junhua, et al. Deep captioning with multimodal recurrent neural networks. 2014. pdf
  10. Xu, Kelvin, et al. Show, attend and tell: Neural image caption generation with visual attention. 2015. pdf

13 Speech Recognition

  1. Hinton, Geoffrey, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. (2012) pdf
  2. Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. 2013 pdf
  3. Graves, Alex, and Navdeep Jaitly. Towards End-To-End Speech Recognition with Recurrent Neural Networks. 2014 pdf
  4. Sak, Haşim, et al. Fast and accurate recurrent neural network acoustic models for speech recognition. (2015). pdf
  5. Amodei, Dario, et al. Deep speech 2: End-to-end speech recognition in english and mandarin. (2015). pdf
  6. W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. Achieving Human Parity in Conversational Speech Recognition. (2016) pdf

14 Deep Learning Optimization and More

  1. Hinton, Geoffrey E., et al. Improving neural networks by preventing co-adaptation of feature detectors. pdf
  2. Srivastava, Nitish, et al. Dropout: a simple way to prevent neural networks from overfitting. pdf
  3. Ioffe, Sergey, and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. pdf
  4. Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer normalization. pdf
  5. Courbariaux, Matthieu, et al. Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1. pdf
  6. Jaderberg, Max, et al. Decoupled neural interfaces using synthetic gradients. pdf
  7. Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. Net2net: Accelerating learning via knowledge transfer. pdf
  8. Wei, Tao, et al. Network Morphism. arXiv preprint arXiv:1603.01670 (2016). pdf
  9. Sutskever, Ilya, et al. On the importance of initialization and momentum in deep learning. pdf
  10. Kingma, Diederik, and Jimmy Ba. Adam: A method for stochastic optimization. pdf
  11. Andrychowicz, Marcin, et al. Learning to learn by gradient descent by gradient descent. pdf
  12. Han, Song, Huizi Mao, and William J. Dally. Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding. pdf
  13. Iandola, Forrest N., et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. pdf

15 Robotics


  1. Koutník, Jan, et al. Evolving large-scale neural networks for vision-based reinforcement learning. Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013. pdf
  2. Levine, Sergey, et al. End-to-end training of deep visuomotor policies. Journal of Machine Learning Research 17.39 (2016): 1-40. pdf
  3. Pinto, Lerrel, and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. arXiv preprint arXiv:1509.06825 (2015). pdf
  4. Levine, Sergey, et al. Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection. arXiv preprint arXiv:1603.02199 (2016). pdf
  5. Zhu, Yuke, et al. Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning. arXiv preprint arXiv:1609.05143 (2016). pdf
  6. Yahya, Ali, et al. Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search. arXiv preprint arXiv:1610.00673 (2016). pdf
  7. Gu, Shixiang, et al. Deep Reinforcement Learning for Robotic Manipulation. arXiv preprint arXiv:1610.00633 (2016). pdf
  8. A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell. Sim-to-Real Robot Learning from Pixels with Progressive Nets. arXiv preprint arXiv:1610.04286 (2016). pdf
  9. Mirowski, Piotr, et al. Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673 (2016). pdf

16 Deep Transfer Learning / Lifelong Learning / especially for RL

  1. Bengio, Yoshua. Deep Learning of Representations for Unsupervised and Transfer Learning. ICML Unsupervised and Transfer Learning 27 (2012): 17-36. pdf (A Tutorial)
  2. Silver, Daniel L., Qiang Yang, and Lianghao Li. Lifelong Machine Learning Systems: Beyond Learning Algorithms. AAAI Spring Symposium: Lifelong Machine Learning. 2013. pdf (A brief discussion about lifelong learning)
  3. Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015). pdf (Godfather's Work)
  4. Rusu, Andrei A., et al. Policy distillation. arXiv preprint arXiv:1511.06295 (2015). pdf (RL domain)
  5. Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. Actor-mimic: Deep multitask and transfer reinforcement learning. arXiv preprint arXiv:1511.06342 (2015). pdf (RL domain)
  6. Rusu, Andrei A., et al. Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016). pdf (Outstanding Work, A novel idea)

17 One Shot Deep Learning

  1. Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. Human-level concept learning through probabilistic program induction. Science 350.6266 (2015): 1332-1338. pdf (No Deep Learning, but worth reading)
  2. Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. Siamese Neural Networks for One-shot Image Recognition. (2015). pdf
  3. Santoro, Adam, et al. One-shot Learning with Memory-Augmented Neural Networks. arXiv preprint arXiv:1605.06065 (2016). pdf (A basic step to one shot learning)
  4. Vinyals, Oriol, et al. Matching Networks for One Shot Learning. arXiv preprint arXiv:1606.04080 (2016). pdf
  5. Hariharan, Bharath, and Ross Girshick. Low-shot visual object recognition. arXiv preprint arXiv:1606.02819 (2016). pdf (A step to large data)

18 Neural Turing Machine

  1. Graves, Alex, Greg Wayne, and Ivo Danihelka. Neural turing machines. arXiv preprint arXiv:1410.5401 (2014). pdf (Basic Prototype of Future Computer)
  2. Zaremba, Wojciech, and Ilya Sutskever. Reinforcement learning neural Turing machines. arXiv preprint arXiv:1505.00521 (2015). pdf
  3. Weston, Jason, Sumit Chopra, and Antoine Bordes. Memory networks. arXiv preprint arXiv:1410.3916 (2014). pdf
  4. Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. End-to-end memory networks. Advances in neural information processing systems. 2015. pdf
  5. Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. Pointer networks. Advances in Neural Information Processing Systems. 2015. pdf
  6. Graves, Alex, et al. Hybrid computing using a neural network with dynamic external memory. Nature (2016). pdf

Credit: Prof. Peter N. Belhumeur
