An MNIST CNN developed to recognize handwritten digits. It accepts a single grayscale input channel, so each input is 1x28x28.
The CNN is trained in batches of 32 images from the MNIST database. Each of the 32 filters is a 2D 3x3 kernel that extracts features from the 1x28x28 input.
A Visual of an MNIST CNN
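The shape arithmetic for a convolution with 32 filters of size 3x3 over a 1x28x28 input can be sketched in plain Python. The stride of 1, the padding of 0, and the per-filter bias are assumptions, since only the filter size, filter count, and input shape are stated; the helper names are hypothetical and not taken from the project's code.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_param_count(in_channels: int, out_channels: int, kernel: int) -> int:
    """Learnable parameters: one kernel per (input, output) channel pair,
    plus one bias per filter (the bias is an assumption)."""
    return out_channels * in_channels * kernel * kernel + out_channels


# Values from the text: 1x28x28 input, 32 filters of size 3x3.
in_channels, height, width = 1, 28, 28
filters, kernel = 32, 3

# Assumed stride=1 and padding=0, so each spatial side shrinks by kernel - 1.
out_h = conv_output_size(height, kernel)
out_w = conv_output_size(width, kernel)
output_shape = (filters, out_h, out_w)
params = conv_param_count(in_channels, filters, kernel)

print(output_shape)  # (32, 26, 26)
print(params)        # 320
```

Under these assumptions each filter yields one 26x26 feature map; with a padding of 1 the maps would stay 28x28 instead.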
To train the CNN, run:
poetry run python src/train_model.py
This will load the MNIST database and train the model for 10 epochs. When training finishes, the trained model state is saved to the digital_model.pt file.
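As a rough guide to how much work the training run involves, the number of optimizer steps follows from the batch size and epoch count. This sketch assumes the full standard MNIST training split of 60,000 images is used, which is not confirmed here; the function name is hypothetical.

```python
import math

TRAIN_IMAGES = 60_000  # standard MNIST training split (assumed to be used in full)
BATCH_SIZE = 32        # from the text
EPOCHS = 10            # from the text


def steps_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


batches = steps_per_epoch(TRAIN_IMAGES, BATCH_SIZE)
total_steps = batches * EPOCHS

print(batches)      # 1875
print(total_steps)  # 18750
```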
Once training completes, run the model with the generated digital_model.pt file using:
poetry run python src/main.py
For MNIST, the Adam optimizer with a learning rate of lr=1e-3 performs best.
Starting with Stochastic Gradient Descent
Where
The Adam optimizer modifies SGD's parameter update as follows:
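To make the difference between the two update rules concrete, here is a minimal sketch of one SGD step and one Adam step for a single scalar parameter. It is not the project's training code: the function names are hypothetical, and beta1=0.9, beta2=0.999, and eps=1e-8 are assumed values (the commonly used defaults). Only lr=1e-3 comes from the text.

```python
import math


def sgd_step(theta: float, grad: float, lr: float = 1e-3) -> float:
    """Plain SGD: move against the gradient, scaled by the learning rate."""
    return theta - lr * grad


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # correct the bias toward zero
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def grad_fn(theta: float) -> float:
    """Gradient of the toy objective f(theta) = theta**2."""
    return 2.0 * theta


theta_sgd = 1.0
theta_adam, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    theta_sgd = sgd_step(theta_sgd, grad_fn(theta_sgd))
    theta_adam, m, v = adam_step(theta_adam, grad_fn(theta_adam), m, v, t)

print(theta_sgd, theta_adam)
```

On the first step Adam moves the parameter by roughly lr whatever the gradient's magnitude, because the gradient is divided by the square root of its own running second moment; SGD's step scales directly with the gradient.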
For this project,
Nesterov's Momentum Acceleration
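How Nesterov momentum is combined with the optimizer in this project is not spelled out here, so the following is only a generic sketch of the classical look-ahead formulation on a toy objective. The function names and the momentum value of 0.9 are assumptions for illustration.

```python
def nesterov_step(theta, velocity, grad_fn, lr=1e-3, momentum=0.9):
    """One Nesterov momentum update for a scalar parameter.

    The gradient is evaluated at the look-ahead point theta + momentum * velocity
    rather than at theta itself.
    """
    lookahead = theta + momentum * velocity
    velocity = momentum * velocity - lr * grad_fn(lookahead)
    return theta + velocity, velocity


def grad(theta: float) -> float:
    """Gradient of the toy objective f(theta) = theta**2."""
    return 2.0 * theta


theta, velocity = 1.0, 0.0
for _ in range(2000):
    theta, velocity = nesterov_step(theta, velocity, grad)

print(theta)  # close to 0, the minimizer of theta**2
```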
The average loss (cost) after epoch 10 was approximately 0.01253.
I am the sole contributor of this project.
This project is licensed under the MIT License; see LICENSE.md.