cbovar / ConvNetSharp

Deep Learning in C#

Using Model from MnistDemo

fdncred opened this issue

I've followed the examples and trained an MNIST model using the MnistDemo code, serializing the model to JSON. Now that I have a model, I'd like to load it and test it. I can load/deserialize the model just fine, but how do I test it with 28x28 images that I have created separately?

What I'm not clear about is how to get the bitmap data in the format where I can use GetPrediction().

Any tips on how to do this?

Hi,

To evaluate the network on one image you need to create a Volume containing image data:

var input = BuilderInstance.Volume.From(new float[28*28], new Shape(28, 28, 1, 1));

for (var y = 0; y < 28; y++)
{
    for (var x = 0; x < 28; x++)
    {
        float pixel = 0f; // replace with the value of pixel (x, y), scaled to [0..1]
        input.Set(x, y, 0, 0, pixel);
    }
}

And then feed it to the network to get the prediction:

net.Forward(input);
var prediction = net.GetPrediction();

I've found a project that does exactly that here.

This project (https://github.com/wyy272176594/dotnetcore-mnist) is cool: it uses ConvNetSharp to recognize digits from a photo, and it has a web front-end (https://www.wang-yueyang.com/mnist).

Thanks for the tip. One last thought. I'd like to be able to return the confidence level for each digit, 0 - 9. Right now, it just returns the predicted value without a confidence. Is there an easy way to return more than the top prediction and the confidence values for each prediction?

I created this method for anyone else exploring a similar situation.

// Converts a bitmap into a (width, height, 1, 1) volume with values in [0..1].
// Only the first byte of each pixel is read, so this assumes a grayscale image
// (or one with the same value in every channel) and at least 8 bits per pixel.
private unsafe Volume<double> ConvertToVolume(Bitmap bmp)
{
    var vol = BuilderInstance.Volume.From(new double[bmp.Width * bmp.Height], new Shape(bmp.Width, bmp.Height, 1, 1));
    var data = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
    byte* src = (byte*)data.Scan0.ToPointer();
    var width = data.Width;
    var height = data.Height;
    int pixelSize = Image.GetPixelFormatSize(bmp.PixelFormat) / 8; // bytes per pixel
    int offset = data.Stride - width * pixelSize;                  // padding bytes at the end of each row

    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++, src += pixelSize)
        {
            double pixel = *src / 255d; // scale 0..255 to 0..1
            vol.Set(x, y, 0, 0, pixel);
        }
        src += offset; // skip the row padding
    }
    bmp.UnlockBits(data);

    return vol;
}

If your last layer is a SoftmaxLayer, it already outputs a confidence level for each class.

var output = net.Forward(input); // output will contain confidence levels for each class

Net.GetPrediction() is just a convenience method to return the prediction that has the highest confidence level. (see here)

This is what I get when I capture the output of net.Forward()

_storage	{double[10]}	double[]
[0]	0.19452938990722496	double
[1]	0.000796911776060907	double
[2]	0.0034073427213554609	double
[3]	0.016585338155885778	double
[4]	0.0010524275880460284	double
[5]	0.0010967844841814594	double
[6]	0.263684351908287	double
[7]	9.6973903125122763E-06	double
[8]	0.51641612475875909	double
[9]	0.0024216313098867608	double

This matches GetPrediction(), which returns 8: index 8 has the highest value (about 0.52). The "real" answer, however, is 0.
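For anyone who wants more than the top prediction, here is a minimal sketch of ranking the ten confidence values. It works on a plain double[] so it runs without ConvNetSharp; the numbers are copied from the net.Forward() output above. ConfidenceRanking and Rank are hypothetical names, not part of the library, and how you copy the values out of the output volume (for example a ToArray() call on it) is an assumption to check against your ConvNetSharp version.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Values copied from the net.Forward() output shown above (array index = digit).
// Assumption: with a real network these would be read from the output volume.
double[] probabilities =
{
    0.19452938990722496, 0.000796911776060907, 0.0034073427213554609,
    0.016585338155885778, 0.0010524275880460284, 0.0010967844841814594,
    0.263684351908287, 9.6973903125122763E-06, 0.51641612475875909,
    0.0024216313098867608
};

// Print the three most likely digits, best first.
foreach (var (digit, confidence) in ConfidenceRanking.Rank(probabilities).Take(3))
{
    var text = confidence.ToString("F3", CultureInfo.InvariantCulture);
    Console.WriteLine("digit " + digit + ": " + text);
}

// Hypothetical helper, not part of ConvNetSharp.
public static class ConfidenceRanking
{
    // Pairs each confidence with its index (the digit) and sorts by confidence, highest first.
    public static (int Digit, double Confidence)[] Rank(double[] probabilities) =>
        probabilities
            .Select((confidence, digit) => (Digit: digit, Confidence: confidence))
            .OrderByDescending(p => p.Confidence)
            .ToArray();
}
```

With the values above, the top three come out as 8, 6 and 0, so the "real" answer is only the third choice here.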

Thanks for your help. Hopefully I can get it to predict more accurately.