BIGBALLON / Paper_List

Paper reading list during my graduate studies

My Paper Reading List

Convolutional Neural Network

  • (LeNet) LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998).
  • (AlexNet) Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in neural information processing systems. (2012).
  • (ZFNet) Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision. Springer, Cham, (2014).
  • (NIN) Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." (2013). [arXiv:1312.4400]
  • (VGGNet) Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." (2014). [arXiv:1409.1556]
  • (GoogLeNet) Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
  • (BN) Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International Conference on Machine Learning. (2015). [arXiv:1502.03167]
  • (ResNet) He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. (2016). [arXiv:1512.03385] [CVPR 2016 Best Paper] ⭐
  • (Pre-active) He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer International Publishing. (2016). [arXiv:1603.05027]
  • Huang, Gao, et al. "Deep networks with stochastic depth." European Conference on Computer Vision. Springer, Cham, 2016. [arXiv:1603.09382]
  • (Wide ResNet) Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." (2016). [arXiv:1605.07146]
  • (ResNeXt) Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, (2017). [arXiv:1611.05431]
  • (DenseNet) Huang, Gao, et al. "Densely connected convolutional networks." (2016). [arXiv:1608.06993]
  • Pleiss, Geoff, et al. "Memory-efficient implementation of densenets." arXiv preprint (2017). [arXiv:1707.06990]
  • (DPN) Chen, Yunpeng, et al. "Dual path networks." Advances in Neural Information Processing Systems. (2017). [arXiv:1707.01629]
  • (SENet) Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." (2017). [arXiv:1709.01507]
  • (CondenseNet) Huang, Gao, et al. "CondenseNet: An Efficient DenseNet using Learned Group Convolutions." (2017). [arXiv:1711.09224]
  • (GN) Wu, Yuxin, and Kaiming He. "Group Normalization." (2018). [arXiv:1803.08494]
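
The identity shortcut that runs through the ResNet line of work above can be sketched in a few lines. This is a toy 1-D sketch, not the papers' implementation: the scalar weights and the two-step mapping F stand in for real convolutional layers.

```python
def residual_block(x, w1, w2):
    """Toy residual block: y = x + F(x), where F is two linear maps
    with a ReLU in between (He et al., 2016). w1 and w2 are
    hypothetical scalar weights standing in for conv layers."""
    h = max(0.0, w1 * x)   # first linear map + ReLU
    fx = w2 * h            # second linear map
    return x + fx          # identity shortcut

# With zero weights F(x) = 0 and the block reduces to the identity,
# which is part of why very deep stacks remain easy to optimize.
print(residual_block(3.0, 0.0, 0.0))  # → 3.0
```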

Optimizers

  • Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint (2014). [arXiv:1412.6980]
  • Ruder, Sebastian. "An overview of gradient descent optimization algorithms." arXiv preprint (2016). [arXiv:1609.04747]
  • Keskar, Nitish Shirish, and Richard Socher. "Improving Generalization Performance by Switching from Adam to SGD." arXiv preprint (2017). [arXiv:1712.07628]
  • Loshchilov, Ilya, and Frank Hutter. "SGDR: Stochastic gradient descent with warm restarts." arXiv preprint (2016). [arXiv:1608.03983] ⭐
  • Smith, Leslie N. "Cyclical learning rates for training neural networks." Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017. [arXiv:1506.01186]
  • Gastaldi, Xavier. "Shake-shake regularization." arXiv preprint (2017). [arXiv:1705.07485]
  • Huang, Gao, et al. "Snapshot ensembles: Train 1, get M for free." arXiv preprint (2017). [arXiv:1704.00109]
  • Jaderberg, Max, et al. "Population based training of neural networks." arXiv preprint (2017). [arXiv:1711.09846]
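
The SGDR and cyclical-learning-rate papers above both schedule the learning rate over training. A minimal sketch of SGDR's cosine annealing with warm restarts (the default values here are illustrative, not prescribed):

```python
import math

def sgdr_lr(epoch, lr_max=0.1, lr_min=0.0, t0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts
    (Loshchilov & Hutter). t0 is the first cycle length in epochs,
    t_mult multiplies the cycle length after each restart."""
    t_i, t_cur = t0, epoch
    while t_cur >= t_i:   # locate the position within the current cycle
        t_cur -= t_i
        t_i *= t_mult
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))

print(sgdr_lr(0))   # start of first cycle → lr_max
print(sgdr_lr(10))  # warm restart: rate jumps back up to lr_max
```

Snapshot ensembles (Huang et al., above) exploit exactly these restarts, saving a model checkpoint at the end of each cycle.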

Generative Adversarial Network

  • Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. (2014). [arXiv:1406.2661]
  • Mirza, Mehdi, and Simon Osindero. "Conditional generative adversarial nets." (2014). [arXiv:1411.1784]
  • Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." (2015). [arXiv:1511.06434]
  • Reed, Scott, et al. "Generative adversarial text to image synthesis." (2016). [arXiv:1605.05396]
  • Shrivastava, Ashish, et al. "Learning from simulated and unsupervised images through adversarial training." (2016). [arXiv:1612.07828]
  • Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein GAN." (2017). [arXiv:1701.07875]
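
The losses behind the original GAN paper can be written down in a few lines. A sketch with scalar discriminator outputs standing in for batch averages (the non-saturating generator loss is the variant Goodfellow et al. recommend in practice):

```python
import math

def gan_losses(d_real, d_fake):
    """GAN losses (Goodfellow et al., 2014). d_real / d_fake are the
    discriminator's sigmoid outputs in (0, 1) on a real sample and a
    generated sample; scalars here for brevity."""
    loss_d = -(math.log(d_real) + math.log(1.0 - d_fake))  # discriminator
    loss_g = -math.log(d_fake)   # non-saturating generator loss
    return loss_d, loss_g

# At the theoretical equilibrium D(x) = 0.5 everywhere:
print(gan_losses(0.5, 0.5))  # → (2·ln 2, ln 2) ≈ (1.386, 0.693)
```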

(Deep) Reinforcement Learning

  • Value-based
    • (DQN) Deep Q Network
      • Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." (2013). [arXiv:1312.5602]
      • Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." (2015). [Nature 518.7540] ⭐
    • Other improvements:
      • (DDQN) Van Hasselt, Hado, Arthur Guez, and David Silver. "Deep Reinforcement Learning with Double Q-Learning." AAAI. 2016. [arXiv:1509.06461]
      • Schaul, Tom, et al. "Prioritized experience replay." (2015). [arXiv:1511.05952]
      • Wang, Ziyu, et al. "Dueling network architectures for deep reinforcement learning." (2015). [arXiv:1511.06581] [ICML 2016 Best Paper]
  • Actor-Critic
    • (DDPG) Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." (2015). [arXiv:1509.02971]
    • (A3C) Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." ICML (2016). [arXiv:1602.01783] ⭐
    • (ACER) Wang, Ziyu, et al. "Sample efficient actor-critic with experience replay." (2016). [arXiv:1611.01224]
    • (ACKTR) Wu, Yuhuai, et al. "Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation." Advances in Neural Information Processing Systems. (2017). [arXiv:1708.05144]
  • More
    • (UNREAL) Jaderberg, Max, et al. "Reinforcement learning with unsupervised auxiliary tasks." (2016). [arXiv:1611.05397]
    • (TRPO) Schulman, John, et al. "Trust region policy optimization." Proceedings of the 32nd International Conference on Machine Learning (ICML-15). (2015). [arXiv:1502.05477]
    • (PPO) Schulman, John, et al. "Proximal policy optimization algorithms." (2017). [arXiv:1707.06347]
    • Heess, Nicolas, et al. "Emergence of locomotion behaviours in rich environments." (2017). [arXiv:1707.02286]
    • Hessel, Matteo, et al. "Rainbow: Combining Improvements in Deep Reinforcement Learning." (2017). [arXiv:1710.02298]
    • Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." Advances in Neural Information Processing Systems. (2016). [arXiv:1606.04474]
    • (GAIL) Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Advances in Neural Information Processing Systems. (2016). [arXiv:1606.03476]
    • (InfoGAIL) Li, Yunzhu, Jiaming Song, and Stefano Ermon. "InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations." Advances in Neural Information Processing Systems. (2017). [arXiv:1703.08840]
    • Lample, Guillaume, and Devendra Singh Chaplot. "Playing FPS Games with Deep Reinforcement Learning." AAAI. (2017). [arXiv:1609.05521]
    • O'Donoghue, Brendan, et al. "Combining policy gradient and Q-learning." (2016). [arXiv:1611.01626]
    • Merel, Josh, et al. "Learning human behaviors from motion capture by adversarial imitation." (2017). [arXiv:1707.02201]
    • Liu, YuXuan, et al. "Imitation from observation: Learning to imitate behaviors from raw video via context translation." (2017). [arXiv:1707.03374]
    • Hester, Todd, et al. "Deep Q-learning from Demonstrations." Proceedings of the Conference on Artificial Intelligence (AAAI). (2018). [arXiv:1704.03732]
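
The Double DQN entry above fixes Q-learning's overestimation bias by decoupling action selection from action evaluation. A sketch of the bootstrap target, with plain lists of floats standing in for network outputs:

```python
def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, done=False):
    """Double DQN target (van Hasselt et al., 2016): the online
    network chooses the next action, the target network scores it.
    next_q_online / next_q_target are Q-value lists for state s'."""
    if done:
        return reward  # no bootstrap past a terminal state
    best_a = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_a]

# Online net prefers action 1; target net's value for it is used.
print(double_dqn_target(1.0, [0.2, 0.8], [0.5, 0.3], gamma=0.9))  # → 1.27
```

Vanilla DQN would instead use max(next_q_target) here, which systematically overestimates under noise.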

Computer Games

  • 2048 Like Games
    • Szubert, Marcin, and Wojciech Jaśkowski. "Temporal difference learning of n-tuple networks for the game 2048." Computational Intelligence and Games (CIG), IEEE Conference on. IEEE, (2014).
    • Wu, I-Chen, et al. "Multi-stage temporal difference learning for 2048." Technologies and Applications of Artificial Intelligence. Springer, Cham, (2014).
    • Yeh, Kun-Hao, et al. "Multi-stage temporal difference learning for 2048-like games." IEEE Transactions on Computational Intelligence and AI in Games (2016).
    • Jaśkowski, Wojciech. "Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping." IEEE Transactions on Computational Intelligence and AI in Games (2017). ⭐
  • MCTS
    • (UCS) Brügmann, Bernd. "Monte Carlo Go". Vol. 44. Syracuse, NY: Technical report, Physics Department, Syracuse University, (1993).
    • (UCB) Auer, Peter, Nicolo Cesa-Bianchi, and Paul Fischer. "Finite-time analysis of the multiarmed bandit problem." Machine learning 47.2-3 (2002): 235-256.
    • (UCT) Kocsis, Levente, and Csaba Szepesvári. "Bandit based monte-carlo planning." European conference on machine learning. Springer, Berlin, Heidelberg, 2006.
    • (MCTS) Coulom, Rémi. "Efficient selectivity and backup operators in Monte-Carlo tree search." International conference on computers and games. Springer, Berlin, Heidelberg, 2006.
    • (RAVE) Gelly, Sylvain, and David Silver. "Monte-Carlo tree search and rapid action value estimation in computer Go." Artificial Intelligence 175.11 (2011): 1856-1875.
    • Gelly, Sylvain, and David Silver. "Combining online and offline knowledge in UCT." ICML 2007.
      • ICML 2017: Test of Time Award
    • Chaslot, Guillaume MJ-B. "Parallel monte-carlo tree search." International Conference on Computers and Games. Springer, Berlin, Heidelberg, (2008).
    • Segal, Richard B. "On the scalability of parallel UCT." International Conference on Computers and Games. Springer, Berlin, Heidelberg, (2010).
    • Browne, Cameron B., et al. "A survey of monte carlo tree search methods." IEEE Transactions on Computational Intelligence and AI in games 4.1 (2012): 1-43.
  • AlphaGo
    • Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. ⭐
      • APV-MCTS
    • Silver, David, et al. "Mastering the game of go without human knowledge." Nature 550.7676 (2017): 354. ⭐
    • Silver, David, et al. "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm." (2017). [arXiv:1712.01815] ⭐
    • Silver, David, et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362.6419 (2018): 1140-1144.
  • More
    • Silver, David, Richard S. Sutton, and Martin Müller. "Temporal-difference search in computer Go." Machine learning 87.2 (2012): 183-219.
    • Lai, Matthew. "Giraffe: Using deep reinforcement learning to play chess." (2015). [arXiv:1509.01549]
    • Vinyals, Oriol, et al. "StarCraft II: a new challenge for reinforcement learning." (2017). [arXiv:1708.04782]
    • Maddison, Chris J., et al. "Move evaluation in go using deep convolutional neural networks." (2014). [arXiv:1412.6564]
    • Soeda, Shunsuke, and Tomoyuki Kaneko. "Dual lambda search and shogi endgames." Advances in Computer Games. Springer, Berlin, Heidelberg, (2005).
    • (Darkforest) Tian, Yuandong, and Yan Zhu. "Better computer go player with neural network and long-term prediction." (2015). [arXiv:1511.06410]
    • Cazenave, Tristan. "Residual networks for computer Go." IEEE Transactions on Games 10.1 (2018): 107-110.
    • Gao, Chao, Martin Müller, and Ryan Hayward. "Three-Head Neural Network Architecture for Monte Carlo Tree Search." IJCAI. (2018).
    • (ELF) Tian, Yuandong, et al. "ELF: An extensive, lightweight and flexible research platform for real-time strategy games." Advances in Neural Information Processing Systems. (2017).
    • (ELF2) Tian, Yuandong, et al. "ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero." (2019). [arXiv:1902.04522]
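
The UCB/UCT line above is the selection rule at the heart of every MCTS engine in this section. A sketch of UCB1-based child selection (statistics are plain (total value, visit count) pairs; the exploration constant c is a tunable, sqrt(2) being the textbook default):

```python
import math

def uct_select(children, c=math.sqrt(2)):
    """UCB1-based selection as in UCT (Kocsis & Szepesvári, 2006).
    children is a list of (total_value, visit_count) pairs;
    returns the index maximizing mean value + exploration bonus."""
    n_parent = sum(n for _, n in children)
    def score(i):
        w, n = children[i]
        if n == 0:
            return float('inf')  # always expand unvisited moves first
        return w / n + c * math.sqrt(math.log(n_parent) / n)
    return max(range(len(children)), key=score)

print(uct_select([(0, 0), (5, 10)]))   # → 0 (unvisited child wins)
print(uct_select([(9, 10), (1, 10)]))  # → 0 (higher mean, equal bonus)
```

AlphaGo's PUCT variant replaces the log-based bonus with a policy-network prior, but the select-expand-simulate-backup loop around it is the same.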

Others
