
AlexNet Implementation

A Transfer Learning Approach using PyTorch

The growing volume of data and the increasing processing power of GPUs have allowed Deep Learning researchers to train on large datasets and obtain better results on tasks such as Computer Vision and Natural Language Processing.

One such advance in the field of Computer Vision is AlexNet, introduced in the paper "ImageNet Classification with Deep Convolutional Neural Networks".

This architecture was designed in 2012 by Alex Krizhevsky in collaboration with Ilya Sutskever and his Ph.D. advisor, Geoffrey Hinton. The paper had 89098 citations at the time of writing. The model was evaluated on the ILSVRC'2010 data and won the ILSVRC'2012 competition.

This paper is considered one of the most influential papers in the field of Computer Vision. The architecture is broadly similar to LeNet, with more layers and a regularization method called Dropout, which helps reduce overfitting. The paper builds intuition for working with deep convolutional layers, for using a non-saturating non-linearity called ReLU, and for regularization techniques such as Data Augmentation and Dropout.

Task

To predict the class label of an image given as input from the provided dataset (CIFAR-10).

Datasets

CIFAR-10 Dataset

Download it from here

This dataset contains 50000 training samples and 10000 testing samples, divided into 10 classes.

Each image is a 32x32, 3-channel (RGB) sample.

Requirements

Python >= 3.0

PyTorch Version >= 0.4.0

torchvision >= 0.2.1

Architecture

Consists of 8 Layers - 5 Convolutional Layers + 3 Fully-Connected Layers

Number of Image Channels = 3

Activation = ReLU

256x256 Input Size (Resized to 224x224 during preprocessing)

Features

Convolutional Layer - Feature Maps : 64, Kernel Size : 11x11, Stride : 4, Padding : 2

ReLU Activation

Max Pooling Layer - Kernel Size : 3x3, Stride : 2

Convolutional Layer - Feature Maps : 192, Kernel Size : 5x5, Padding : 2

ReLU Activation

Max Pooling Layer - Kernel Size : 3x3, Stride : 2

Convolutional Layer - Feature Maps : 384, Kernel Size : 3x3, Padding : 1

ReLU Activation

Convolutional Layer - Feature Maps : 256, Kernel Size : 3x3, Padding : 1

ReLU Activation

Convolutional Layer - Feature Maps : 256, Kernel Size : 3x3, Padding : 1

ReLU Activation

Max Pooling Layer - Kernel Size : 3x3, Stride : 2

FLATTEN
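As a sanity check on the layer list, the spatial size of the feature maps can be traced with the usual output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below assumes a 224x224 input and a 3x3 kernel for the fifth convolution, which is what the 9216-feature input of the classifier requires. The helper name `out_size` is hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224
size = out_size(size, kernel=11, stride=4, padding=2)  # conv1 -> 55
size = out_size(size, kernel=3, stride=2)              # pool1 -> 27
size = out_size(size, kernel=5, padding=2)             # conv2 -> 27
size = out_size(size, kernel=3, stride=2)              # pool2 -> 13
size = out_size(size, kernel=3, padding=1)             # conv3 -> 13
size = out_size(size, kernel=3, padding=1)             # conv4 -> 13
size = out_size(size, kernel=3, padding=1)             # conv5 -> 13 (3x3 assumed)
size = out_size(size, kernel=3, stride=2)              # pool3 -> 6

flat_features = 256 * size * size  # 256 feature maps of 6x6
print(size, flat_features)  # 6 9216
```

With an 11x11 kernel and padding 1 in the fifth convolution, the 13x13 map would shrink to 5x5 and the flattened size would no longer be 9216.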

Classifier

Dropout - 0.5 (Probability of Dropping Neurons)

Fully Connected - 9216 --> 4096

ReLU Activation

Dropout - 0.5

Fully Connected - 4096 --> 1024

ReLU Activation

Fully Connected - 1024 --> 10
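The Features and Classifier lists above can be written as a PyTorch module along the following lines. This is a minimal sketch, not necessarily the repository's own code: the class name `AlexNetCIFAR10` is hypothetical, and the fifth convolution is assumed to use a 3x3 kernel so that the flattened size is 9216.

```python
import torch
import torch.nn as nn


class AlexNetCIFAR10(nn.Module):
    """Sketch of the layer list above; the class name is hypothetical."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # 3x3 kernel assumed so the flattened size is 256 * 6 * 6 = 9216.
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 9216)
        return self.classifier(x)


model = AlexNetCIFAR10()
model.eval()  # disable dropout for the shape check
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```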

NOTE - In the Classifier, the second fully connected layer is changed from 4096 --> 4096 to 4096 --> 1024 to reduce overfitting and large losses during training, since the network is being trained on this data for the first time and outputs 10 classes instead of the 1000 used for ImageNet.