izzudin01 / digitrecognizer

This project involves the development of a digit recognition system using a two-layer neural network, specifically designed to classify handwritten digits (0-9). The system was built and trained on the MNIST dataset, which contains 70,000 images of handwritten digits.

digitrecognizer

Dataset link: https://www.kaggle.com/competitions/digit-recognizer/data

The neural network has two layers, not counting the input. The input layer $a^{[0]}$ has 784 units, one for each of the 784 pixels in a 28x28 input image. The hidden layer $a^{[1]}$ has 10 units with ReLU activation, and the output layer $a^{[2]}$ has 10 units, one per digit class, with softmax activation.
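
As a minimal sketch of this architecture, the parameters could be allocated as follows. NumPy, the function name `init_params`, the uniform initialisation in [-0.5, 0.5), and the seed are assumptions made for illustration; they are not taken from the project's notebook.

```python
import numpy as np

def init_params(n_input=784, n_hidden=10, n_output=10, seed=0):
    """Allocate weights and biases for a 784-10-10 network (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Small random weights in [-0.5, 0.5): an assumed initialisation scheme.
    W1 = rng.uniform(-0.5, 0.5, size=(n_hidden, n_input))
    b1 = np.zeros((n_hidden, 1))
    W2 = rng.uniform(-0.5, 0.5, size=(n_output, n_hidden))
    b2 = np.zeros((n_output, 1))
    return W1, b1, W2, b2

W1, b1, W2, b2 = init_params()
print(W1.shape, b1.shape, W2.shape, b2.shape)  # (10, 784) (10, 1) (10, 10) (10, 1)
```

Random rather than all-zero weights are used so that the hidden units do not all compute the same value.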

Forward propagation

$$Z^{[1]} = W^{[1]} X + b^{[1]}$$
$$A^{[1]} = g_{\text{ReLU}}(Z^{[1]})$$
$$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$$
$$A^{[2]} = g_{\text{softmax}}(Z^{[2]})$$
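
The four forward-propagation equations map almost line for line onto NumPy. The sketch below assumes NumPy and random stand-in data; the function names (`relu`, `softmax`, `forward_prop`) and the max-subtraction trick in the softmax are illustrative choices, not necessarily what the notebook does.

```python
import numpy as np

def relu(Z):
    # Element-wise max(0, z).
    return np.maximum(Z, 0)

def softmax(Z):
    # Subtract each column's max before exponentiating for numerical stability;
    # this does not change the result.
    expZ = np.exp(Z - Z.max(axis=0, keepdims=True))
    return expZ / expZ.sum(axis=0, keepdims=True)

def forward_prop(W1, b1, W2, b2, X):
    Z1 = W1 @ X + b1      # (10, 784) @ (784, m) + (10, 1) -> (10, m)
    A1 = relu(Z1)
    Z2 = W2 @ A1 + b2     # (10, 10) @ (10, m) + (10, 1) -> (10, m)
    A2 = softmax(Z2)
    return Z1, A1, Z2, A2

# Random stand-in data: m = 5 fake "images" with pixel values in [0, 1).
rng = np.random.default_rng(0)
m = 5
X = rng.random((784, m))
W1 = rng.uniform(-0.5, 0.5, (10, 784))
b1 = np.zeros((10, 1))
W2 = rng.uniform(-0.5, 0.5, (10, 10))
b2 = np.zeros((10, 1))

Z1, A1, Z2, A2 = forward_prop(W1, b1, W2, b2, X)
print(A2.shape)  # (10, 5)
```

Each column of `A2` is a probability distribution over the ten digit classes, so its entries are non-negative and sum to 1 up to floating-point error.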

Backward propagation

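$$dZ^{[2]} = A^{[2]} - Y$$
$$dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]T}$$
$$db^{[2]} = \frac{1}{m} \sum dZ^{[2]}$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} \odot g^{[1]\prime}(Z^{[1]})$$
$$dW^{[1]} = \frac{1}{m} dZ^{[1]} A^{[0]T}$$
$$db^{[1]} = \frac{1}{m} \sum dZ^{[1]}$$

Here $Y$ is the one-hot encoding of the labels (10 x m), each sum runs over the $m$ examples (the columns), and $\odot$ denotes element-wise multiplication.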
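
A hedged NumPy sketch of these gradients follows. It assumes the loss is cross-entropy (the setting in which $dZ^{[2]} = A^{[2]} - Y$ holds for a softmax output) and that $Y$ is one-hot encoded; `one_hot`, `forward`, and `backward_prop` are hypothetical helper names, and the data is random.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # Column j has a single 1 in the row given by labels[j].
    Y = np.zeros((num_classes, labels.size))
    Y[labels, np.arange(labels.size)] = 1
    return Y

def forward(W1, b1, W2, b2, X):
    Z1 = W1 @ X + b1
    A1 = np.maximum(Z1, 0)                            # ReLU
    Z2 = W2 @ A1 + b2
    expZ = np.exp(Z2 - Z2.max(axis=0, keepdims=True))
    A2 = expZ / expZ.sum(axis=0, keepdims=True)       # softmax
    return Z1, A1, Z2, A2

def backward_prop(Z1, A1, A2, W2, X, Y):
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m          # sum over the m examples
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)                     # ReLU derivative is 1 where Z1 > 0, else 0
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Random stand-in data, not MNIST.
rng = np.random.default_rng(0)
m = 6
X = rng.random((784, m))
Y = one_hot(rng.integers(0, 10, size=m))
W1 = rng.uniform(-0.5, 0.5, (10, 784))
b1 = np.zeros((10, 1))
W2 = rng.uniform(-0.5, 0.5, (10, 10))
b2 = np.zeros((10, 1))

Z1, A1, Z2, A2 = forward(W1, b1, W2, b2, X)
dW1, db1, dW2, db2 = backward_prop(Z1, A1, A2, W2, X, Y)
print(dW1.shape, db1.shape, dW2.shape, db2.shape)  # (10, 784) (10, 1) (10, 10) (10, 1)
```

Because every column of both `A2` and `Y` sums to 1, the entries of `db2` sum to zero up to floating-point error, which makes a cheap sanity check.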

Parameter updates

$$W^{[2]} := W^{[2]} - \alpha dW^{[2]}$$
$$b^{[2]} := b^{[2]} - \alpha db^{[2]}$$
$$W^{[1]} := W^{[1]} - \alpha dW^{[1]}$$
$$b^{[1]} := b^{[1]} - \alpha db^{[1]}$$
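
The update rule is one gradient-descent step per parameter. A minimal sketch, assuming NumPy arrays and a hypothetical `update_params` helper; the learning rate and the constant gradients below are made-up values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    # Move each parameter against its gradient, scaled by the learning rate alpha.
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
    return W1, b1, W2, b2

# Toy values: every parameter is 1.0 and every gradient entry is 2.0.
W1, b1 = np.ones((10, 784)), np.ones((10, 1))
W2, b2 = np.ones((10, 10)), np.ones((10, 1))
dW1, db1 = np.full((10, 784), 2.0), np.full((10, 1), 2.0)
dW2, db2 = np.full((10, 10), 2.0), np.full((10, 1), 2.0)

W1_new, b1_new, W2_new, b2_new = update_params(
    W1, b1, W2, b2, dW1, db1, dW2, db2, alpha=0.1
)
# With alpha = 0.1, every entry moves from 1.0 to about 0.8.
print(W1_new[0, 0], b2_new[0, 0])
```

The function returns new arrays rather than modifying its inputs, so the old parameters stay available for comparison.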

Vars and shapes

Forward prop

  • $A^{[0]} = X$: 784 x m
  • $Z^{[1]} \sim A^{[1]}$: 10 x m
  • $W^{[1]}$: 10 x 784 (as $W^{[1]} A^{[0]} \sim Z^{[1]}$)
  • $b^{[1]}$: 10 x 1
  • $Z^{[2]} \sim A^{[2]}$: 10 x m
  • $W^{[2]}$: 10 x 10 (as $W^{[2]} A^{[1]} \sim Z^{[2]}$)
  • $b^{[2]}$: 10 x 1
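
These shapes can be checked mechanically. The sketch below assumes NumPy and uses all-zero placeholder arrays with an arbitrary batch size; it only illustrates how the dimensions line up, including the 10 x 1 biases being broadcast across the m columns.

```python
import numpy as np

m = 4  # illustrative batch size
A0 = np.zeros((784, m))
W1, b1 = np.zeros((10, 784)), np.zeros((10, 1))
W2, b2 = np.zeros((10, 10)), np.zeros((10, 1))

Z1 = W1 @ A0 + b1        # (10, 784) @ (784, m) -> (10, m); b1 is broadcast over the columns
A1 = np.maximum(Z1, 0)   # same shape as Z1
Z2 = W2 @ A1 + b2        # (10, 10) @ (10, m) -> (10, m)
print(Z1.shape, A1.shape, Z2.shape)  # (10, 4) (10, 4) (10, 4)

# A weight matrix with the wrong inner dimension is rejected by the matrix product.
try:
    np.zeros((10, 784)) @ A1   # inner dimensions 784 and 10 do not match
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```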

Backprop

  • $dZ^{[2]}$: 10 x m ($\sim A^{[2]}$)
  • $dW^{[2]}$: 10 x 10
  • $db^{[2]}$: 10 x 1
  • $dZ^{[1]}$: 10 x m ($\sim A^{[1]}$)
  • $dW^{[1]}$: 10 x 784
  • $db^{[1]}$: 10 x 1
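
Each gradient should have the same shape as the parameter it updates. One place where this is easy to get wrong in NumPy is the bias gradient: summing over the examples without `keepdims=True` drops the column axis. The sketch below uses placeholder arrays and an arbitrary learning rate of 0.1, both assumptions for illustration.

```python
import numpy as np

m = 4  # illustrative batch size
A0 = np.ones((784, m))
dZ1 = np.ones((10, m))
b1 = np.zeros((10, 1))

dW1 = dZ1 @ A0.T / m                          # (10, m) @ (m, 784) -> (10, 784), same as W1
db1 = dZ1.sum(axis=1, keepdims=True) / m      # (10, 1), same as b1
db1_flat = dZ1.sum(axis=1) / m                # (10,): the column axis is gone

print(dW1.shape, db1.shape, db1_flat.shape)   # (10, 784) (10, 1) (10,)
print((b1 - 0.1 * db1).shape)                 # (10, 1)
print((b1 - 0.1 * db1_flat).shape)            # (10, 10): broadcasting silently changes the shape
```

The last line raises no error, which is why asserting the shapes after each update can be worthwhile.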

Languages

Jupyter Notebook: 100.0%