Questions about inference with SCRFD in TensorRT
lpkoh opened this issue · comments
Hi,
I am trying to convert SCRFD from PyTorch to TensorRT and run inference, using this repo as a reference. I found #37, which is a similar earlier question.
First question:
I converted SCRFD to TensorRT and obtained the bindings below:
Only the first is the input binding. Could I clarify:
- What is the difference between (1, x, 1) and (1, x, 4)? The previous post seems to describe it as anchor centers vs. bbox predictions. If it is the bbox prediction, where is the objectness/class score, as is typical?
- Why are there 3 values of x?
Second question:
Would I need to preprocess the image by subtracting the mean of 127.5 and dividing by 128? My original weights and scrfd2onnx script are from https://github.com/deepinsight/insightface
Hi! Actually, (1, x, 1) holds the scores, while (1, x, 4) holds the bboxes.
Anchor centers are calculated once for the desired resolution and are used internally to convert bbox predictions to actual coordinates.
There are three sets of arrays because SCRFD uses strides for detection (same as retina), which are basically a set of three scales used for detection. Each scale produces its own predictions, which are later filtered by score and NMS.
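To make the two points above concrete, here is a minimal Python sketch; it is not code from this repo. It assumes strides of 8, 16 and 32 with two anchors per grid cell, and box outputs given as (left, top, right, bottom) distances in stride units, as in the upstream InsightFace SCRFD code. The names `anchor_centers` and `decode_boxes` are hypothetical, and the assumptions should be checked against your own export.

```python
def anchor_centers(height, width, stride, num_anchors=2):
    """Return the (cx, cy) center of every anchor for one stride, row by row."""
    centers = []
    for y in range(height // stride):
        for x in range(width // stride):
            # Assumption: each grid cell carries `num_anchors` anchors
            # that share the same center.
            centers.extend([(x * stride, y * stride)] * num_anchors)
    return centers


def decode_boxes(centers, distances, stride):
    """Convert (left, top, right, bottom) distances into (x1, y1, x2, y2) boxes."""
    boxes = []
    for (cx, cy), (left, top, right, bottom) in zip(centers, distances):
        # Assumption: raw outputs are in stride units, so scale them to pixels.
        boxes.append((cx - left * stride, cy - top * stride,
                      cx + right * stride, cy + bottom * stride))
    return boxes


# Number of anchors per stride for a 640x640 input.
for stride in (8, 16, 32):
    print(stride, len(anchor_centers(640, 640, stride)))
```

Under these assumptions a 640x640 input gives 12800, 3200 and 800 anchors for the three strides, which would account for the three different values of x.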
Yes, you need to preprocess the image.
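A rough sketch of the normalization described in the question, i.e. (pixel - 127.5) / 128, written with plain Python lists so it runs without dependencies. The function name `preprocess` is hypothetical, and the conversion from H x W x C to C x H x W is an assumption about the engine's input layout. Resizing and channel order (RGB vs. BGR) are not covered here and need to match whatever the model was exported with.

```python
def preprocess(image_hwc, mean=127.5, scale=128.0):
    """Normalize an H x W x C image (values 0..255) and return it as C x H x W."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        # One H x W plane per channel, each pixel shifted by `mean`
        # and divided by `scale`.
        [[(image_hwc[y][x][c] - mean) / scale for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]


# A single pixel with three channel values, just to show the mapping.
print(preprocess([[[0, 127.5, 255]]]))
```

In practice the same arithmetic would normally be done on a NumPy array or with an image library before copying the data into the input binding.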
Ah, thank you so much for the clarification!
I was looking through the repo and couldn't find a script to filter the predictions by score and NMS. Do you have one available for reference? Also, the predictions from all three scales are to be taken together, right?
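For reference, the filtering step asked about here could be sketched roughly as follows. This is not the repo's script. It assumes that the boxes are already decoded to (x1, y1, x2, y2) pixel coordinates, that the (1, x, 1) score arrays are flattened to plain lists, and that candidates from all strides are pooled before a single NMS pass, as the upstream InsightFace reference code does. The names `iou` and `filter_and_nms` and both threshold defaults are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(per_stride, score_thresh=0.5, iou_thresh=0.4):
    """per_stride: list of (scores, boxes) pairs, one pair per stride."""
    # Pool candidates from every stride, dropping low-confidence ones.
    candidates = [
        (score, box)
        for scores, boxes in per_stride
        for score, box in zip(scores, boxes)
        if score >= score_thresh
    ]
    # Greedy NMS: visit boxes from highest to lowest score and keep a box
    # only if it does not overlap an already kept box too strongly.
    candidates.sort(key=lambda item: item[0], reverse=True)
    kept = []
    for score, box in candidates:
        if all(iou(box, kept_box) <= iou_thresh for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

Called with one `(scores, boxes)` pair per stride, it returns the surviving `(score, box)` pairs sorted by descending score.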