-----------------------------------------------------------------------------------------
| I am very new to this field, what papers should I read so as to take one step forward? |
-----------------------------------------------------------------------------------------
Billions of academic papers have been published around the world, but perhaps only 0.0...01% of them are valuable or worth reading. Because life is short, TopPaper provides a Top Academic Paper Chart to help beginners and researchers move forward one step faster.
Contributions of new subjects or papers you consider valuable are welcome. Please feel free to open a pull request or an issue.
- 0. Traditional Methods
- 1. CNN - Convolutional Neural Network
- 1.1 Image Classification
- 1.2 Object Detection
- 1.3 Object Segmentation
- 1.4 Re_ID Person Re-Identification
- 1.5 OCR Optical Character Recognition
- 1.6 Face Recognition
- 1.7 NAS Neural Architecture Search
- 1.8 Image Super_Resolution
- 1.9 Image Denoising
- 1.10 Model Compression - Decomposition, Pruning, Quantization, KD
- 2. Transformer in Vision
- 3. Transformer and Self-Attention in NLP
- 4. Others
- Acknowledgement
Abbreviation | Paper | Cited by | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
SIFT | Object Recognition from Local Scale-Invariant Features | 20 K | ICCV | 1999 | David G. Lowe | University of British Columbia |
HOG | Histograms of Oriented Gradients for Human Detection | 35 K | CVPR | 2005 | Navneet Dalal | INRIA Rhône-Alpes |
SURF | SURF: Speeded Up Robust Features | 18 K | ECCV | 2006 | Herbert Bay | ETH Zurich |
...... |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
LeNet | Backpropagation applied to handwritten zip code recognition | 8.3 K | Neural Computation | 1989 | Yann LeCun | AT&T Bell Laboratories |
LeNet | Gradient-based learning applied to document recognition | 35 K | Proceedings of the IEEE | 1998 | Yann LeCun | AT&T Research Laboratories |
ImageNet | ImageNet: A large-scale hierarchical image database | 26 K | CVPR | 2009 | Jia Deng | Princeton University |
AlexNet | ImageNet Classification with Deep Convolutional Neural Networks | 79 K | NIPS | 2012 | Alex Krizhevsky | University of Toronto |
ZFNet | Visualizing and Understanding Convolutional Networks | 11 K | ECCV | 2014 | Matthew D Zeiler | New York University |
VGGNet | Very Deep Convolutional Networks for Large-Scale Image Recognition | 55 K | ICLR | 2015 | Karen Simonyan | Oxford |
GoogLeNet | Going Deeper with Convolutions | 29 K | CVPR | 2015 | Christian Szegedy | |
GoogLeNet_v2_v3 | Rethinking the Inception Architecture for Computer Vision | 12 K | CVPR | 2016 | Christian Szegedy | |
ResNet | Deep Residual Learning for Image Recognition | 74 K | CVPR | 2016 | Kaiming He | MSRA |
DenseNet | Densely Connected Convolutional Networks | 15 K | CVPR | 2017 | Gao Huang | Cornell University |
ResNeXt | Aggregated Residual Transformations for Deep Neural Networks | 3.9 K | CVPR | 2017 | Saining Xie | UC San Diego |
MobileNet | MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | 7.7 K | arXiv | 2017 | Andrew G. Howard | |
SENet | Squeeze-and-Excitation Networks | 6.3 K | CVPR | 2018 | Jie Hu | Momenta |
MobileNet_v2 | MobileNetV2: Inverted Residuals and Linear Bottlenecks | 4.4 K | CVPR | 2018 | Mark Sandler | |
ShuffleNet | ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | 2.3 K | CVPR | 2018 | Xiangyu Zhang | Megvii |
ShuffleNet V2 | ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design | 1.3 K | ECCV | 2018 | Ningning Ma | Megvii |
MobileNet_v3 | Searching for MobileNetV3 | 0.6 K | ICCV | 2019 | Andrew Howard | |
EfficientNet | EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | 1.9 K | ICML | 2019 | Mingxing Tan | |
GhostNet | GhostNet: More Features from Cheap Operations | 0.1 K | CVPR | 2020 | Kai Han | Huawei Noah |
AdderNet | AdderNet: Do We Really Need Multiplications in Deep Learning? | 33 | CVPR | 2020 | Hanting Chen | Huawei Noah |
Res2Net | Res2Net: A New Multi-scale Backbone Architecture | 0.2 K | TPAMI | 2021 | Shang-Hua Gao | Nankai University |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
BN | Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | 26 K | ICML | 2015 | Sergey Ioffe | |
Xavier Init | Understanding the difficulty of training deep feedforward neural networks | 12 K | AISTATS | 2010 | Xavier Glorot | Université de Montréal |
Kaiming Init | Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | 11 K | ICCV | 2015 | Kaiming He | MSRA |
LN | Layer Normalization | 2.9 K | NIPS | 2016 | Jimmy Lei Ba | University of Toronto |
GN | Group Normalization | 1.1 K | ECCV | 2018 | Yuxin Wu | FAIR |
- | Bag of Tricks for Image Classification with Convolutional Neural Networks | 361 | CVPR | 2019 | Tong He | Amazon |
- | Fixing the train-test resolution discrepancy | 122 | NeurIPS | 2019 | Hugo Touvron | FAIR |
Auto-Augment | AutoAugment: Learning Augmentation Policies from Data | 487 | CVPR | 2019 | Ekin D. Cubuk | |
- | Fixing the train-test resolution discrepancy: FixEfficientNet | 53 | arXiv | 2020 | Hugo Touvron | FAIR |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
RCNN | Rich feature hierarchies for accurate object detection and semantic segmentation | 17 K | CVPR | 2014 | Ross Girshick | Berkeley |
Fast RCNN | Fast R-CNN | 14 K | ICCV | 2015 | Ross Girshick | Microsoft Research |
Faster RCNN | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | 20 K | NIPS | 2015 | Shaoqing Ren | USTC, MSRA |
SSD | SSD: Single Shot MultiBox Detector | 13 K | ECCV | 2016 | Wei Liu | UNC |
YOLO | You Only Look Once: Unified, Real-Time Object Detection | 15 K | CVPR | 2016 | Joseph Redmon | University of Washington |
Mask RCNN | Mask R-CNN | 10 K | ICCV | 2017 | Kaiming He | FAIR |
DSSD | DSSD : Deconvolutional Single Shot Detector | 1.0 K | CVPR | 2017 | Cheng-Yang Fu | UNC |
YOLO9000 | YOLO9000: Better, Faster, Stronger | 7.7 K | CVPR | 2017 | Joseph Redmon | University of Washington |
FPN | Feature Pyramid Networks for Object Detection | 6.7 K | CVPR | 2017 | Tsung-Yi Lin | FAIR |
Focal Loss | Focal Loss for Dense Object Detection | 6.7 K | ICCV | 2017 | Tsung-Yi Lin | FAIR |
Deformable Conv | Deformable Convolutional Networks | 1.6 K | ICCV | 2017 | Jifeng Dai | MSRA |
YOLO V3 | YOLOv3: An Incremental Improvement | 6.9 K | arXiv | 2018 | Joseph Redmon | University of Washington |
ATSS | Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | 0.1 K | CVPR | 2020 | Shifeng Zhang | CASIA |
EfficientDet | EfficientDet: Scalable and Efficient Object Detection | 0.3 K | CVPR | 2020 | Mingxing Tan | |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
FCN | Fully Convolutional Networks for Semantic Segmentation | 22 K | CVPR | 2015 | Jonathan Long | UC Berkeley |
DeepLab | DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs | 7.4 K | ICLR | 2015 | Liang-Chieh Chen | |
Unet | U-Net: Convolutional Networks for Biomedical Image Segmentation | 24 K | MICCAI | 2015 | Olaf Ronneberger | University of Freiburg |
- | Learning to Segment Object Candidates | 0.6 K | NIPS | 2015 | Pedro O. Pinheiro | FAIR |
Dilated Conv | Multi-Scale Context Aggregation by Dilated Convolutions | 4.5 K | ICLR | 2016 | Fisher Yu | Princeton University |
- | Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network | 0.7 K | CVPR | 2017 | Chao Peng | Tsinghua |
RefineNet | RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | 1.6 K | CVPR | 2017 | Guosheng Lin | The University of Adelaide |
Abbreviation | Paper | Cited by | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
CTC | Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks | 2.9 K | ICML | 2006 | Alex Graves | IDSIA |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
Darts | DARTS: Differentiable Architecture Search | 1.3 K | ICLR | 2019 | Hanxiao Liu | CMU |
- | Neural Architecture Search with Reinforcement Learning | 2.5 K | ICLR | 2017 | Barret Zoph | |
- | Efficient Neural Architecture Search via Parameter Sharing | 1.2 K | ICML | 2018 | Hieu Pham | |
- | SNAS: Stochastic Neural Architecture Search | 0.3 K | ICLR | 2019 | Sirui Xie | SenseTime |
PC-Darts | PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search | 159 | ICLR | 2020 | Yuhui Xu | Huawei |
Abbreviation | Paper | Cited By | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
CBDNet | Toward Convolutional Blind Denoising of Real Photographs | 0.2 K | CVPR | 2019 | Shi Guo | HIT |
- | Learning Deep CNN Denoiser Prior for Image Restoration | 0.8 K | CVPR | 2017 | Kai Zhang | HIT |
DnCNN | Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising | 2.9 K | TIP | 2017 | Kai Zhang | HIT |
FFDNet | FFDNet: Toward a fast and flexible solution for CNN based image denoising | 0.6 K | TIP | 2018 | Kai Zhang | HIT |
SRMD | Learning a Single Convolutional Super-Resolution Network for Multiple Degradations | 0.3 K | CVPR | 2018 | Kai Zhang | HIT |
RIDNet | Real Image Denoising with Feature Attention | 87 | ICCV | 2019 | Saeed Anwar | CSIRO |
CycleISP | CycleISP: Real Image Restoration via Improved Data Synthesis | 28 | CVPR | 2020 | Syed Waqas Zamir | UAE |
AINDNet | Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization | 14 | CVPR | 2020 | Yoonsik Kim | Seoul National University |
Abbreviation | Paper | Cited by | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
Image Transformer | Image Transformer | 337 | ICML | 2018 | Niki Parmar | |
- | Attention Augmented Convolutional Networks | 191 | ICCV | 2019 | Irwan Bello | |
DETR | End-to-End Object Detection with Transformers | 252 | ECCV | 2020 | Nicolas Carion | Facebook AI |
DeiT | Training data-efficient image transformers & distillation through attention | 57 | arXiv | 2020 | Hugo Touvron | FAIR |
i-GPT | Generative Pretraining from Pixels | 38 | ICML | 2020 | Mark Chen | OpenAI |
Deformable DETR | Deformable DETR: Deformable Transformers for End-to-End Object Detection | 12 | ICLR | 2021 | Xizhou Zhu | SenseTime |
ViT | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | 175 | ICLR | 2021 | Alexey Dosovitskiy | |
IPT | Pre-Trained Image Processing Transformer | 16 | CVPR | 2021 | Hanting Chen | Huawei Noah |
- | A Survey on Visual Transformer | 12 | arXiv | 2021 | Kai Han | Huawei Noah |
TNT | Transformer in Transformer | 8 | arXiv | 2021 | Kai Han | Huawei Noah |
...... |
Abbreviation | Paper | Cited by | Journal | Year | 1st Author | 1st Affiliation |
---|---|---|---|---|---|---|
Transformer | Attention Is All You Need | 19 K | NIPS | 2017 | Ashish Vaswani | |
- | Self-Attention with Relative Position Representations | 0.5 K | NAACL | 2018 | Peter Shaw | |
BERT | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 17 K | NAACL | 2019 | Jacob Devlin | |
......
Thanks to Aidong Men, Bo Yang, Zhuqing Jiang, Qishuo Lu, Zhengxin Zeng, Jia'nan Han, Pengliang Tang, Yiyun Zhao, Xian Zhang ...... for their materials and help.