PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.


The detection results of the TFLite model converted from the YOLO-NAS model are inaccurate.

serael opened this issue

Issue Type

Others

OS

Linux

onnx2tf version number

1.19.10

onnx version number

1.15.0

onnxruntime version number

1.16.3

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.15.0

Download URL for ONNX

https://drive.google.com/drive/folders/1fZX3cDqJW2I1NAiKnHip3_AWMlMbZHNz?usp=sharing

Parameter Replacement JSON

-

Description

  1. This project runs object detection on an Android device using the YOLO-NAS model, which has excellent detection speed, converted to TFLite.
  2. The conversion succeeded, but the detection results were not correct.
  3. An ONNX model without quantization and without NMS was created and tested.
  4. This is an important issue because object detection does not work when the TFLite model is run on mobile.
  5. "onnx2tf -i yolo_nas_l_ocr_nms.onnx -cotof" or "onnx2tf -i yolo_nas_l_ocr_nms.onnx -osd -cotof"
    YOLO-NAS ONNX model: https://drive.google.com/drive/folders/1fZX3cDqJW2I1NAiKnHip3_AWMlMbZHNz?usp=sharing

Hello,
I am considering using YOLO-NAS for an object detection feature on Android.
I converted the ONNX model to TFLite and tested it, but nothing was detected, so I would like to ask for advice.

Since I am a beginner in deep learning, I did not attempt parameter replacement and only used the basic commands.

ONNX result:
[screenshot omitted]

onnx2tf -i yolo_nas_l_ocr_nms.onnx -cotof
[screenshot omitted]

You should write what kind of tests you did with tflite and what specific output you got. The conversion was successful and there is nothing I can advise you on due to the lack of information. I will not test your model by writing test code for each and every one of your models.


Please state what test code was used, what test data was used, and what output was obtained. I am not a tester.

The TFLite execution code is Android Java.

This is the code that checks the output values from the execution code.

    // Run inference; outputMap maps each output index to a pre-allocated ByteBuffer.
    tfLite.runForMultipleInputsOutputs(inputArray, outputMap);

    float[][] out_box = new float[8400][4];
    float[][] out_class = new float[8400][68];

    ByteBuffer byteBuffer_box = (ByteBuffer) outputMap.get(0);
    ByteBuffer byteBuffer_class = (ByteBuffer) outputMap.get(1);

    // Rewind before reading back; the output buffers must use native byte order.
    byteBuffer_box.rewind();
    byteBuffer_class.rewind();

    // [8400, 4] box coordinates
    for (int j = 0; j < 8400; ++j) {
        for (int i = 0; i < 4; ++i) {
            out_box[j][i] = byteBuffer_box.getFloat();
        }
    }

    // [8400, 68] class scores
    for (int j = 0; j < 8400; ++j) {
        for (int i = 0; i < 68; ++i) {
            out_class[j][i] = byteBuffer_class.getFloat();
        }
    }

    // Track the highest class score found across all 8400 candidates.
    float maxClass = 0f;
    int detectedClass = -1;

    for (int i = 0; i < 8400; ++i) {
        for (int c = 0; c < 68; ++c) {
            if (out_class[i][c] > maxClass) {
                detectedClass = c;
                maxClass = out_class[i][c];
            }
        }
    }

There is no maxClass value greater than 0.1f.
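Before digging further into the scores, it may be worth confirming at runtime which output index actually holds the [8400, 4] box tensor, since the output order of a converted model does not always match the original ONNX model. A minimal sketch, assuming tfLite is a standard org.tensorflow.lite.Interpreter:

    // Log every output tensor's shape before parsing the ByteBuffers, to confirm
    // which index holds the [8400, 4] boxes and which holds the [8400, 68] scores.
    for (int k = 0; k < tfLite.getOutputTensorCount(); ++k) {
        android.util.Log.d("TFLiteOutputs",
                "output " + k + " shape = "
                + java.util.Arrays.toString(tfLite.getOutputTensor(k).shape()));
    }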

The detection threshold for the ONNX result was set to 0.5f.

The input is image data of shape [640, 640, 3], and the outputs have shapes [8400, 4] and [8400, 68].

The [8400, 4] output is (min x, min y, max x, max y), and the [8400, 68] output is the class probabilities.
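For reference, here is a minimal sketch of how these two outputs would typically be combined into detections. The helper name decodeDetections is hypothetical (not from this thread), and whether the box coordinates come out in pixels or normalized to [0, 1] depends on how the ONNX model was exported:

    // Hypothetical helper: pair each candidate box with its best class score
    // and keep only candidates above the score threshold.
    static java.util.List<float[]> decodeDetections(
            float[][] outBox, float[][] outClass, float scoreThreshold) {
        java.util.List<float[]> detections = new java.util.ArrayList<>();
        for (int i = 0; i < outBox.length; ++i) {
            int bestClass = -1;
            float bestScore = 0f;
            for (int c = 0; c < outClass[i].length; ++c) {
                if (outClass[i][c] > bestScore) {
                    bestScore = outClass[i][c];
                    bestClass = c;
                }
            }
            if (bestScore >= scoreThreshold) {
                // {min x, min y, max x, max y, score, class index}
                detections.add(new float[] {
                        outBox[i][0], outBox[i][1], outBox[i][2], outBox[i][3],
                        bestScore, bestClass
                });
            }
        }
        return detections;
    }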

This is the test image:
[test image omitted]

How is the pre-processing normalized?

If I have to keep asking this many low-level questions, I will close this issue.

Thank you for your answer.

Since the input is an image, pre-processing is as follows.

    // image square size: width 640, height 640
    INPUT_SIZE = 640;
    // color range 0~255
    IMAGE_STD = 255.0f;

    int[] intValues = new int[INPUT_SIZE * INPUT_SIZE];
    bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());

    // 3 channels (RGB), 4 bytes per float
    ByteBuffer imgData = ByteBuffer.allocateDirect(INPUT_SIZE * INPUT_SIZE * 3 * 4);
    imgData.order(ByteOrder.nativeOrder());

    for (int i = 0; i < INPUT_SIZE; ++i) {
        for (int j = 0; j < INPUT_SIZE; ++j) {
            int pixelValue = intValues[i * INPUT_SIZE + j];
            imgData.putFloat(((pixelValue >> 16) & 0xFF) / IMAGE_STD);  // R
            imgData.putFloat(((pixelValue >> 8) & 0xFF) / IMAGE_STD);   // G
            imgData.putFloat((pixelValue & 0xFF) / IMAGE_STD);          // B
        }
    }

The pixels are normalized by dividing by 255, i.e. to the range [0, 1].

You can close the issue if you feel my questions are too low-level.

Thank you for your time.

thank you

Based on your advice, I modified the input normalization range of the ONNX and TFLite models and checked the results on Android.
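As a closing sketch (not part of the original thread): whether the 1/255 normalization already lives inside the exported ONNX/TFLite graph determines what the Android pre-processing should feed it, so the divisor from the earlier loop can be made a parameter and matched to the export settings. The helper below is hypothetical; the 640 input size and RGB channel order follow the code above:

    // Requires android.graphics.Bitmap, java.nio.ByteBuffer, java.nio.ByteOrder.
    // Hypothetical helper: same pre-processing as above, but with the divisor as a
    // parameter so the input range (raw 0-255 vs scaled 0-1) can be matched to
    // whatever the exported graph expects.
    static ByteBuffer preprocess(Bitmap bitmap, int inputSize, float imageStd) {
        int[] intValues = new int[inputSize * inputSize];
        bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0,
                bitmap.getWidth(), bitmap.getHeight());

        ByteBuffer imgData = ByteBuffer.allocateDirect(inputSize * inputSize * 3 * 4);
        imgData.order(ByteOrder.nativeOrder());

        for (int i = 0; i < inputSize; ++i) {
            for (int j = 0; j < inputSize; ++j) {
                int pixelValue = intValues[i * inputSize + j];
                imgData.putFloat(((pixelValue >> 16) & 0xFF) / imageStd);  // R
                imgData.putFloat(((pixelValue >> 8) & 0xFF) / imageStd);   // G
                imgData.putFloat((pixelValue & 0xFF) / imageStd);          // B
            }
        }
        imgData.rewind();
        return imgData;
    }

    // Usage (assumed values): pass 255.0f if the graph expects [0, 1] input,
    // or 1.0f if the normalization is already inside the graph.
    // ByteBuffer imgData = preprocess(bitmap, 640, 255.0f);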