SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

Questions about inference with SCRFD in TensorRT

lpkoh opened this issue

commented

Hi,

I am trying to convert SCRFD from PyTorch to TensorRT and run inference, using this repo as a reference. I am looking at #37, which is a similar earlier question.

First question:
I converted SCRFD to TensorRT and obtained the bindings below:
[image: screenshot of the TensorRT engine bindings]
Only the first is the input binding. Could I clarify:

  • What is the difference between (1, x, 1) and (1, x, 4)? The previous post seems to describe it as anchor centers vs. bbox predictions. If (1, x, 4) is the bbox prediction, where is the objectness/class score that detectors typically output?
  • Why are there 3 values of x?

Second question:
Would I need to preprocess the image by subtracting the mean of 127.5 and dividing by 128? My original weights and scrfd2onnx script are from https://github.com/deepinsight/insightface

  1. Hi! Actually, (1, x, 1) is the scores, while (1, x, 4) is the bboxes.
    Anchor centers are calculated once for the desired resolution and are used internally to convert bbox predictions to actual coordinates.
    There are three sets of arrays because SCRFD uses strides for detection (same as RetinaFace), which are basically a set of three scales used for detection. Each scale produces its own predictions, which are later filtered by score and NMS.

  2. Yes, you need to preprocess the image.
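For reference, the two points in the reply above could be sketched roughly like this in NumPy. This is a minimal sketch, not the repo's actual code: the function names are made up for illustration, and it assumes RGB channel order, strides of 8, 16 and 32, two anchors per grid cell, and bbox outputs expressed as left/top/right/bottom distances in units of the stride. Please check these assumptions against your exported model.

```python
import numpy as np

def preprocess(img_bgr, mean=127.5, std=128.0):
    """HWC uint8 image -> NCHW float32 blob, normalized as (pixel - mean) / std."""
    img = img_bgr[:, :, ::-1].astype(np.float32)  # BGR -> RGB (assumed channel order)
    img = (img - mean) / std
    return np.ascontiguousarray(img.transpose(2, 0, 1)[None])

def anchor_centers(height, width, stride, num_anchors=2):
    """(x, y) centers for one stride; num_anchors=2 is an assumption."""
    ys, xs = np.mgrid[:height // stride, :width // stride]
    centers = np.stack([xs, ys], axis=-1).astype(np.float32).reshape(-1, 2)
    centers *= stride
    # Every grid cell is repeated once per anchor.
    return np.repeat(centers, num_anchors, axis=0)

def distance2bbox(centers, distances):
    """Turn (left, top, right, bottom) distances into (x1, y1, x2, y2) boxes."""
    x1 = centers[:, 0] - distances[:, 0]
    y1 = centers[:, 1] - distances[:, 1]
    x2 = centers[:, 0] + distances[:, 2]
    y2 = centers[:, 1] + distances[:, 3]
    return np.stack([x1, y1, x2, y2], axis=-1)

# Example for one output pair at a 640x640 input (stride 8 assumed):
# centers = anchor_centers(640, 640, 8)                    # shape (12800, 2)
# boxes = distance2bbox(centers, bbox_pred.reshape(-1, 4) * 8)
```

Under these assumptions a 640x640 input gives x = 12800, 3200 and 800 for the three strides, which is one way to check which binding belongs to which scale.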

commented

Ah, thank you so much for the clarification!

I was looking through the repo and couldn't find a script that filters the predictions by score and NMS. Do you have one available for reference? Also, the predictions from all three scales are to be taken together, right?

This is my scrfd implementation: scrfd.py
And this is nms

Yes, all three scales are used to select the best candidates with the help of score thresholding and NMS.
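To illustrate the idea, here is a minimal NumPy sketch of pooling the three scales, thresholding by score and then running a plain greedy NMS. It is not the code from the linked scrfd.py; the function names and the default thresholds (0.5 for score, 0.4 for IoU) are placeholders, and it assumes the boxes have already been decoded to (x1, y1, x2, y2).

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.4):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Intersection of the best remaining box with all the others.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        union = np.maximum(areas[i] + areas[rest] - inter, 1e-9)
        # Drop everything that overlaps the kept box too much.
        order = rest[inter / union <= iou_thresh]
    return np.array(keep, dtype=np.int64)

def filter_detections(scores_per_scale, boxes_per_scale,
                      score_thresh=0.5, iou_thresh=0.4):
    """Pool all scales, threshold by score, then apply NMS once."""
    scores = np.concatenate([np.asarray(s, dtype=np.float32).reshape(-1)
                             for s in scores_per_scale])
    boxes = np.concatenate([np.asarray(b, dtype=np.float32).reshape(-1, 4)
                            for b in boxes_per_scale])
    mask = scores >= score_thresh
    scores, boxes = scores[mask], boxes[mask]
    keep = nms(boxes, scores, iou_thresh)
    return boxes[keep], scores[keep]
```

Running NMS once on the pooled candidates, rather than per scale, lets a face that is detected at two strides be collapsed into a single box.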