Here we have 1 input layer, 1 hidden layer and 1 output layer. We are provided with initial weights for each layer.
In each layer, starting from the hidden layer, a sigmoid function is applied to the output to introduce non-linearity.
Finally, the two outputs give two error terms, E1 and E2. Each error is multiplied by 1/2 so that the factor of 2 from the squared term cancels when calculating derivatives.
We compute the output of each layer by multiplying by the given weights, applying the sigmoid wherever the architecture specifies it.
E_total is the sum of E1 and E2, where t1 and t2 are the two target (true) labels.
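The forward pass and loss described above can be sketched in NumPy as follows. The input, weight, and target values here are hypothetical placeholders for illustration, not the initial values supplied in the assignment:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values (the assignment provides its own initial weights)
x  = np.array([0.05, 0.10])           # two inputs
W1 = np.array([[0.15, 0.20],          # w1, w2
               [0.25, 0.30]])         # w3, w4  (input -> hidden)
W2 = np.array([[0.40, 0.45],          # w5, w6
               [0.50, 0.55]])         # w7, w8  (hidden -> output)
t  = np.array([0.01, 0.99])           # targets t1, t2

# Forward pass: weighted sum, then sigmoid at each layer
h = sigmoid(W1 @ x)                   # hidden activations
o = sigmoid(W2 @ h)                   # output activations o1, o2

# Per-output squared errors, scaled by 1/2 for clean derivatives
E = 0.5 * (t - o) ** 2                # E1, E2
E_total = E.sum()
```

Scaling each error by 1/2 means the derivative of E w.r.t. an output is simply (o - t), with no stray factor of 2.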
Since we have two weight layers (hidden and output), we backpropagate E_total with respect to the weights in both layers.
To backpropagate through the output layer, we calculate the derivatives of E_total w.r.t. w5, w6, w7, w8.
To backpropagate through the hidden layer, we calculate the derivatives of E_total w.r.t. w1, w2, w3, w4.
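The two derivative steps above can be sketched with the chain rule in NumPy. This reuses the same hypothetical toy network as in the forward-pass sketch; it illustrates the mechanics, not the assignment's exact numbers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same hypothetical toy network as the forward pass
x  = np.array([0.05, 0.10])
W1 = np.array([[0.15, 0.20], [0.25, 0.30]])
W2 = np.array([[0.40, 0.45], [0.50, 0.55]])
t  = np.array([0.01, 0.99])
h = sigmoid(W1 @ x)
o = sigmoid(W2 @ h)

# Output layer (w5..w8), chain rule:
#   dE/do     = (o - t)        (the 1/2 cancels the square's 2)
#   do/dnet   = o * (1 - o)    (sigmoid derivative)
#   dnet/dw   = h              (the hidden activation feeding that weight)
delta_o = (o - t) * o * (1 - o)
dE_dW2 = np.outer(delta_o, h)         # gradients for w5, w6, w7, w8

# Hidden layer (w1..w4): propagate delta_o back through W2,
# then apply the sigmoid derivative at the hidden layer
delta_h = (W2.T @ delta_o) * h * (1 - h)
dE_dW1 = np.outer(delta_h, x)         # gradients for w1, w2, w3, w4
```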
As we increase the learning rate from 0.1 to 2, we see that the loss drops drastically and tends to 0 in fewer iterations.
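The effect of the learning rate can be demonstrated with a plain gradient-descent loop on the same hypothetical toy network (placeholder values, not the assignment's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lr, steps=100):
    # Hypothetical toy network, as in the earlier sketches
    x  = np.array([0.05, 0.10])
    t  = np.array([0.01, 0.99])
    W1 = np.array([[0.15, 0.20], [0.25, 0.30]])
    W2 = np.array([[0.40, 0.45], [0.50, 0.55]])
    for _ in range(steps):
        h = sigmoid(W1 @ x)
        o = sigmoid(W2 @ h)
        delta_o = (o - t) * o * (1 - o)
        delta_h = (W2.T @ delta_o) * h * (1 - h)
        # Vanilla gradient descent: w <- w - lr * dE/dw
        W2 -= lr * np.outer(delta_o, h)
        W1 -= lr * np.outer(delta_h, x)
    # Recompute the loss with the final weights
    h = sigmoid(W1 @ x)
    o = sigmoid(W2 @ h)
    return float(np.sum(0.5 * (t - o) ** 2))

# The larger learning rate drives E_total toward 0 in far fewer steps
print(train(0.1), train(2.0))
```

On a smooth one-sample problem like this, a large step size is safe; on real datasets a learning rate of 2 can easily overshoot and diverge.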
Achieve 99.4% or more accuracy on the MNIST dataset with the following constraints -
We use seven convolution layers, two max-pooling layers, and two transition layers, followed by an average-pooling layer.
The model uses 18,738 parameters in total, which satisfies the constraint of fewer than 20k parameters.
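How such an architecture stays under the budget can be sketched by adding up per-layer parameter counts. The channel progression below is a hypothetical example (it does not reproduce the actual model or its 18,738 figure); it only shows the counting, with bias-free 3x3 convolutions and 1x1 "transition" convolutions that shrink the channel count after pooling:

```python
def conv_params(c_in, c_out, k=3, bias=False):
    """Parameters in a conv layer: k*k*c_in*c_out (+ c_out if bias)."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# Hypothetical channel progression: seven 3x3 convs, two 1x1 transitions
layers = [
    conv_params(1, 8),          # conv1: 72
    conv_params(8, 8),          # conv2: 576
    conv_params(8, 16),         # conv3: 1152
    conv_params(16, 8, k=1),    # transition 1 (after max-pool): 128
    conv_params(8, 16),         # conv4: 1152
    conv_params(16, 16),        # conv5: 2304
    conv_params(16, 8, k=1),    # transition 2 (after max-pool): 128
    conv_params(8, 16),         # conv6: 1152
    conv_params(16, 10),        # conv7, 10 output channels before GAP: 1440
]
total = sum(layers)
print(total)  # 8104, well under the 20,000-parameter constraint
```

Max-pooling and average-pooling layers contribute no parameters, which is why the budget is dominated by the convolutions alone.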
We achieved 99.43% accuracy on the validation dataset at epoch 18; although accuracy dropped slightly in later epochs, this met the acceptance criteria for the assignment.