
Home page: https://medium.com/@harald-scheidl/c-3cb420971abb


Simple MNIST classifier in plain C++

A neural network written in ~260 lines of plain C++ code without any dependencies. The model is trained to distinguish the handwritten digits "0" and "1" from the MNIST dataset.

Training model...
Dataset size: 12665
Epoch: 0 Sample: 0 Loss: 2.44357
Epoch: 0 Sample: 1000 Loss: 0.230895
...
Epoch: 9 Sample: 12000 Loss: 0.00112786
Testing model...
Dataset size: 2115
Accuracy: 0.994326





              X
         XXXXXXXXXX
        XXXXXXXXXXXX
        XXX      XXXX
        XX        XXX
        XX         XXX
       XX          XXX
       XX           XX
       XX           XX
      XX            XX
      XX            XX
      XX            XX
      XX           XXX
      XX           XX
      XX          XXX
      XX         XXX
      XX       XXXX
      XXXXXXXXXXXXX
       XXXXXXXXXX
        XXXXXXX




Predicted: 0 (0.155461) Target: 0
Press ENTER to see next sample...






                X
               XXX
               XXX
              XXX
              XXX
             XXXX
             XXX
             XXX
             XXX
            XXX
            XXX
            XXX
           XXX
           XXX
           XXX
           XX
          XXX
          XXX
          XXX
           X



Predicted: 1 (0.962643) Target: 1
Press ENTER to see next sample...

How to run it

  • Unzip data.zip and make sure the unzipped files are in the same folder as mnist.cpp
  • Compile the C++ code, e.g. with g++ mnist.cpp on Linux, or by using Visual Studio on Windows
  • Run the program, e.g. by executing ./a.out on Linux
  • The model takes ~10s to train; it is then evaluated on the test set (expect ~99% accuracy), and finally individual samples and their predictions are shown

Notes

  • The model is in fact a regression model (squared-error loss and linear activation in the final layer) to keep things as simple as possible. It is trained to output 0.0 for "0" and 1.0 for "1"; raw outputs can, however, also fall outside that range
  • The RegressionModel class allows configuring the model (e.g., number of layers)
  • Good results are achieved with 2-4 layers and 5-15 units per hidden layer
  • The backpropagation code follows the algorithm outlined in the Deep Learning book by Bishop
  • The dataset is created from the original MNIST dataset
