Deep Learning Papers

Papers about deep learning, ordered by task and then by date. Current state-of-the-art papers are labelled.

Object Recognition

Aggregated Residual Transformations for Deep Neural Networks, nov 2016, arxiv
Hierarchical Object Detection with Deep Reinforcement Learning, nov 2016, arxiv
Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition, oct 2016, IBM, paper
T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos, aug 2016, github, arxiv
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, aug 2016, Google, arxiv
Residual Networks of Residual Networks: Multilevel Residual Networks, aug 2016, arxiv
Context Matters: Refining Object Detection in Video with Recurrent Neural Networks, jul 2016, arxiv
Training Region-based Object Detectors with Online Hard Example Mining, apr 2016, Facebook, arxiv
Deep Residual Learning for Image Recognition, dec 2015, arxiv
SSD: Single Shot MultiBox Detector, dec 2015, Google, github, arxiv
ParseNet: Looking Wider to See Better, jun 2015, arxiv
You Only Look Once: Unified, Real-Time Object Detection, jun 2015, Facebook, arxiv
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, jun 2015, Microsoft/Facebook, arxiv
Rich feature hierarchies for accurate object detection and semantic segmentation, 2014, paper
Selective Search for Object Recognition, 2012, paper

Pose Estimation

Fast Single Shot Detection and Pose Estimation, sep 2016, arxiv

Face Recognition

Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition, paper
OpenFace: A general-purpose face recognition library with mobile applications, jun 2016, paper
Deep Face Recognition, 2015, paper
Compact Convolutional Neural Network Cascade for Face Detection, aug 2015, arxiv
Learning Robust Deep Face Representation, jul 2015, arxiv
FaceNet: A Unified Embedding for Face Recognition and Clustering, jun 2015, paper
Multi-view Face Detection Using Deep Convolutional Neural Networks, feb 2015, Yahoo, arxiv

Style Transfer

A learned representation for artistic style, oct 2016, Google, arxiv, demo
Fast Style Transfer in TensorFlow, github
Instance Normalization: The Missing Ingredient for Fast Stylization, sep 2016, arxiv
A Neural Algorithm of Artistic Style, sep 2015, arxiv
Perceptual Losses for Real-Time Style Transfer and Super-Resolution, arxiv, github

Logo Recognition

Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks, apr 2016, arxiv
Logo Localization and Recognition in Natural Images Using Homographic Class Graphs, 2016, paper
LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks, nov 2015, arxiv
DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer, oct 2015, Berkeley, arxiv
Automatic detection of logos in video and their removal using inpainting, jul 2015, paper
On the Benefit of Synthetic Data for Company Logo Detection, 2015, paper
Fast and Robust Realtime Storefront Logo Recognition, paper
Scalable Logo Recognition in Real-World Images, 2011, paper
https://arxiv.org/pdf/1609.01414v1.pdf

Note: this section also includes some papers that use SIFT.

Text (in the Wild) Recognition

COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images, jun 2016, arxiv
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild, mar 2016, arxiv
Efficient Scene Text Localization and Recognition with Local Character Refinement, apr 2015, arxiv
Reading Text in the Wild with Convolutional Neural Networks, dec 2014, arxiv
Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition, jun 2014, arxiv
Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011, paper

Image / Video Description

Generation and Comprehension of Unambiguous Object Descriptions, apr 2016, arxiv
Long-term Recurrent Convolutional Networks for Visual Recognition and Description, may 2016, arxiv

Detect key actor

Detecting events and key actors in multi-person videos, mar 2015, arxiv

ConvNet visualization

Visualizing and Understanding Convolutional Networks, nov 2013, arxiv

Image Segmentation

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, dec 2015, arxiv

Object part detection

Discovering the physical parts of an articulated object class from multiple videos, 2016, paper

Pedestrian Detection

Joint Deep Learning for Pedestrian Detection, 2013, paper

Lip Reading

Lip Reading in the Wild, 2016, Oxford, paper

Super Resolution

RAISR: Rapid and Accurate Image Super Resolution, oct 2016, Google, arxiv
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, sep 2016, Twitter, arxiv

Image Compression

Full Resolution Image Compression with Recurrent Neural Networks, aug 2016, Google, arxiv

Automated Theorem Proving

DeepMath - Deep Sequence Models for Premise Selection, jun 2016, Google, arxiv

Reverse Engineering

Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, oct 2016, arxiv
Stealing Machine Learning Models via Prediction APIs, aug 2016, paper

Language

Rationalizing Neural Predictions, github, arxiv

Translation

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, nov 2016, Google, arxiv
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, oct 2016, Google, arxiv

Other

Architecture and optimization

RenderGAN: Generating Realistic Labeled Data, nov 2016, arxiv
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, feb 2016, arxiv
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, feb 2016, arxiv
Snapshot Ensembles: Train 1, Get M for Free, 2016, paper, github

Tools for Deep Learning

Barrista, github
Deeplearning4j, github
Caffe: Convolutional Architecture for Fast Feature Embedding, jun 2014, github, arxiv

Tools for Papers

http://www.arxiv-sanity.com/

Data Sets

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition, jul 2016, arxiv
Family in the Wild (FIW): A Large-scale Kinship Recognition Database, apr 2016, arxiv
https://github.com/openimages/dataset
YouTube-8M: A Large-Scale Video Classification Benchmark, sep 2016, Google, arxiv

Uncategorized

https://github.com/blue-yonder/tsfresh
