Questions about inference with SCRFD in TensorRT

Question

lpkoh opened this issue 3 years ago

Hi,

I am trying to convert SCRFD from PyTorch to TensorRT and run inference, using this repo as a reference. I found #37, which asks a similar question.

First question:
I converted SCRFD to TensorRT and obtained the bindings below:

Only the first is the input binding. Could I clarify:

What is the difference between (1, x, 1) and (1, x, 4)? The previous post seems to describe it as anchor centers vs. bbox predictions. If (1, x, 4) is the bbox prediction, where is the objectness/class score that detectors typically output?
Why are there three values of x?

Second question:
Would I need to preprocess the image by subtracting the mean of 127.5 and dividing by 128? My original weights and scrfd2onnx script are from https://github.com/deepinsight/insightface

SthPhoenix · Answer 1 · Tue Mar 29 2022 01:58:36 GMT+0800 (China Standard Time)

Hi! Actually, (1, x, 1) holds the scores, while (1, x, 4) holds the bboxes.
Anchor centers are calculated once for the desired resolution and are used internally to convert bbox predictions to actual coordinates.
There are three sets of arrays because SCRFD detects at three strides (the same as RetinaFace), which are basically three scales used for detection. Each scale produces its own predictions, which are later filtered by score and NMS.
Yes, you need to preprocess the image.
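
For readers following along, here is a minimal NumPy sketch of the two steps mentioned in this answer: preprocessing the image and turning per-stride bbox predictions into coordinates via anchor centers. It is not the code from this repo. The function names are hypothetical, and several details are assumptions rather than facts stated in the thread: RGB channel order, two anchors per grid cell, and bbox outputs expressed as (left, top, right, bottom) distances from the anchor center. The mean of 127.5 and divisor of 128 are the values from the question.

```python
import numpy as np


def preprocess(image_bgr, mean=127.5, std=128.0):
    """HWC uint8 BGR image -> NCHW float32 blob, normalized as (x - mean) / std."""
    # Assumption: the model expects RGB input, so the channel order is flipped.
    rgb = image_bgr[:, :, ::-1].astype(np.float32)
    blob = (rgb - mean) / std
    # HWC -> CHW, then add a batch dimension.
    return np.ascontiguousarray(blob.transpose(2, 0, 1)[None])


def anchor_centers(height, width, stride, num_anchors=2):
    """Anchor centers (x, y) in pixels for one stride.

    Assumption: num_anchors anchors share each grid cell, so every center
    is repeated num_anchors times in a row.
    """
    ys, xs = np.mgrid[0:height // stride, 0:width // stride]
    centers = np.stack([xs, ys], axis=-1).astype(np.float32).reshape(-1, 2)
    return np.repeat(centers * stride, num_anchors, axis=0)


def distance2bbox(centers, distances):
    """Convert (left, top, right, bottom) pixel distances to (x1, y1, x2, y2)."""
    x1 = centers[:, 0] - distances[:, 0]
    y1 = centers[:, 1] - distances[:, 1]
    x2 = centers[:, 0] + distances[:, 2]
    y2 = centers[:, 1] + distances[:, 3]
    return np.stack([x1, y1, x2, y2], axis=-1)
```

Under these assumptions, a 640x640 input with strides 8, 16 and 32 would give 12800, 3200 and 800 anchors, which is one way three different values of x could arise. If the network predicts distances in units of the stride, they would need to be multiplied by the stride before calling `distance2bbox`; check this against the implementation you are porting.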

lpkoh · Answer 2 · Tue Mar 29 2022 02:18:53 GMT+0800 (China Standard Time)

Ah, thank you so much for the clarification!

I was looking through the repo and couldn't find a script that filters the predictions by score and NMS. Do you have one available for reference? Also, the predictions from all three scales are to be taken together, right?

SthPhoenix · Answer 3 · Tue Mar 29 2022 02:33:04 GMT+0800 (China Standard Time)

This is my scrfd implementation: scrfd.py
And this is nms

Yes, all three scales are used to select the best candidates with the help of thresholding and NMS.
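
As a rough illustration of that last step, the sketch below thresholds each scale's scores, pools the surviving candidates from all scales, and runs a plain greedy NMS over the pooled set. It is not the linked scrfd.py or nms code. The names `nms` and `select_detections` and the default thresholds are hypothetical, and the boxes are assumed to be already decoded to (x1, y1, x2, y2) pixel coordinates.

```python
import numpy as np


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS. Returns indices of kept boxes, highest score first."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # candidate indices, best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box.
        iw = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0, None)
        inter = iw * ih
        union = areas[i] + areas[rest] - inter
        iou = inter / np.maximum(union, 1e-9)
        # Drop boxes that overlap the current best box too much.
        order = rest[iou <= iou_threshold]
    return np.array(keep, dtype=np.int64)


def select_detections(per_scale, score_threshold=0.5, iou_threshold=0.4):
    """per_scale: list of (scores (N,), boxes (N, 4)) pairs, one per stride.

    Assumption: boxes are already in pixel coordinates of the network input.
    """
    scores, boxes = [], []
    for s, b in per_scale:
        mask = s >= score_threshold  # per-scale score filter
        scores.append(s[mask])
        boxes.append(b[mask])
    scores = np.concatenate(scores)
    boxes = np.concatenate(boxes)
    keep = nms(boxes, scores, iou_threshold)  # one NMS pass over all scales
    return boxes[keep], scores[keep]
```

Running NMS once over the pooled candidates, rather than once per scale, lets a face detected at two neighbouring scales collapse into a single box.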