SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

scrfd_* shape issue

gulldan opened this issue · comments

commented

Hello,

For scrfd_* models, if I try out "test_images/Stallone.jpg" I get the error below:

docker run -p 18080:18080 -it --gpus '"device=0"' \
  -e LOG_LEVEL=INFO -e PYTHONUNBUFFERED=0 -e NUM_WORKERS=1 \
  -e INFERENCE_BACKEND=trt -e FORCE_FP16=True \
  -e DET_NAME=scrfd_10g_gnkps -e DET_THRESH=0.6 \
  -e REC_NAME=glint360k_r100FC_1.0 -e REC_IGNORE=False -e REC_BATCH_SIZE=1 \
  -e GA_NAME=genderage_v1 -e GA_IGNORE=False \
  -e KEEP_ALL=True -e MAX_SIZE=1024,780 \
  -e DEF_RETURN_FACE_DATA=True -e DEF_EXTRACT_EMBEDDING=True -e DEF_EXTRACT_GA=True -e DEF_API_VER='1' \
  --mount type=bind,source=/home/work/services/models,target=/models \
  --health-cmd='curl -f http://localhost:18080/info || exit 1' \
  --health-interval=1m --health-timeout=10s --health-retries=3 \
  insightface-rest
Preparing models...
mxnet version: 1.8.0
onnx version: 1.7.0
[08:23:40] INFO - Preparing 'glint360k_r100FC_1.0' model...
[08:23:40] INFO - Preparing 'scrfd_10g_gnkps' model...
[08:23:40] INFO - Reshaping ONNX inputs to: (1, 3, 780, 1024)
[08:23:40] INFO - Building TRT engine for scrfd_10g_gnkps...
[08:23:52] INFO - Building TensorRT engine with FP16 support.
[TensorRT] WARNING: /workspace/TensorRT/t/oss-cicd/oss/parsers/onnx/onnx2trt_utils.cpp:227: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: No implementation obeys reformatting-free rules, at least 10 reformatting nodes are needed, now picking the fastest path instead.
[08:24:46] INFO - Building TRT engine complete!
[08:24:46] INFO - Preparing 'genderage_v1' model...
Starting InsightFace-REST using 1 workers.
mxnet version: 1.8.0
onnx version: 1.7.0
[08:24:49] INFO - Warming up face detector TensorRT engine...
[08:25:02] INFO - Engine warmup complete! Expecting input shape: (1, 3, 780, 1024)
[08:25:02] INFO - Warming up ArcFace TensorRT engine...
[08:25:03] INFO - Engine warmup complete! Expecting input shape: (1, 3, 112, 112). Max batch size: 1
[08:25:03] INFO - Warming up GenderAge TensorRT engine...
[08:25:03] INFO - Engine warmup complete! Expecting input shape: (1, 3, 112, 112). Max batch size: 1
INFO: Started server process [233]
[08:25:03] INFO - Started server process [233]
INFO: Waiting for application startup.
[08:25:03] INFO - Waiting for application startup.
INFO: Application startup complete.
[08:25:03] INFO - Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:18080 (Press CTRL+C to quit)
[08:25:03] INFO - Uvicorn running on http://0.0.0.0:18080 (Press CTRL+C to quit)
INFO: 127.0.0.1:35790 - "GET /info HTTP/1.1" 200 OK
INFO: 192.168.49.168:65047 - "GET /docs HTTP/1.1" 200 OK
INFO: 192.168.49.168:65047 - "GET /openapi.json HTTP/1.1" 200 OK
inference cost: 0.0219266414642334
Traceback (most recent call last):
  File "./modules/processing.py", line 236, in embed
    faces = await self.model.get(image, max_size=max_size, threshold=threshold,
  File "./modules/face_model.py", line 189, in get
    boxes, probs, landmarks, mask_probs = self.det_model.detect(img.transformed_image, threshold=threshold)
  File "./modules/face_model.py", line 40, in detect
    bboxes, landmarks = self.retina.detect(data, threshold=threshold)
  File "./modules/model_zoo/detectors/scrfd.py", line 173, in detect
    scores_list, bboxes_list, kpss_list = self.forward(img, threshold)
  File "./modules/model_zoo/detectors/scrfd.py", line 159, in forward
    bboxes = distance2bbox(anchor_centers, bbox_preds)
  File "./modules/model_zoo/detectors/scrfd.py", line 39, in distance2bbox
    x1 = points[:, 0] - distance[:, 0]
ValueError: operands could not be broadcast together with shapes (24832,) (25088,)

What could be a solution?
NVIDIA-SMI 465.19.01
Driver Version: 465.19.01
CUDA Version: 11.3
NVIDIA Tesla T4

If I change to retinaface_r50_v1, everything works fine.

I tried changing the base container to nvcr.io/nvidia/tensorrt:21.06-py3, but nothing changed:

Preparing models...
mxnet version: 1.8.0
onnx version: 1.7.0
[07:53:24] INFO - Preparing 'glint360k_r100FC_1.0' model...
[07:53:24] INFO - Preparing 'scrfd_10g_gnkps' model...
[07:53:24] INFO - Reshaping ONNX inputs to: (1, 3, 780, 1024)
[07:53:24] INFO - Building TRT engine for scrfd_10g_gnkps...
[07:53:36] INFO - Building TensorRT engine with FP16 support.
[TensorRT] WARNING: /home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/parsers/onnx/onnx2trt_utils.cpp:271: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_130 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_177 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_141 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_224 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_188 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_152 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_235 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_199 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer InstanceNormalization_246 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation obeys reformatting-free rules, at least 10 reformatting nodes are needed, now picking the fastest path instead.
[07:54:32] INFO - Building TRT engine complete!
[07:54:32] INFO - Preparing 'genderage_v1' model...
Starting InsightFace-REST using 1 workers.
mxnet version: 1.8.0
onnx version: 1.7.0
[07:54:36] INFO - Warming up face detector TensorRT engine...
[07:54:48] INFO - Engine warmup complete! Expecting input shape: (1, 3, 780, 1024)
[07:54:48] INFO - Warming up ArcFace TensorRT engine...
[07:54:49] INFO - Engine warmup complete! Expecting input shape: (1, 3, 112, 112). Max batch size: 1
[07:54:49] INFO - Warming up GenderAge TensorRT engine...
[07:54:49] INFO - Engine warmup complete! Expecting input shape: (1, 3, 112, 112). Max batch size: 1
INFO:     Started server process [233]
[07:54:49] INFO - Started server process [233]
INFO:     Waiting for application startup.
[07:54:49] INFO - Waiting for application startup.
INFO:     Application startup complete.
[07:54:49] INFO - Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:18080 (Press CTRL+C to quit)
[07:54:49] INFO - Uvicorn running on http://0.0.0.0:18080 (Press CTRL+C to quit)
INFO:     192.168.49.168:64156 - "GET /docs HTTP/1.1" 200 OK
INFO:     192.168.49.168:64156 - "GET /openapi.json HTTP/1.1" 200 OK
inference cost: 0.02373218536376953
Traceback (most recent call last):
  File "./modules/processing.py", line 236, in embed
    faces = await self.model.get(image, max_size=max_size, threshold=threshold,
  File "./modules/face_model.py", line 189, in get
    boxes, probs, landmarks, mask_probs = self.det_model.detect(img.transformed_image, threshold=threshold)
  File "./modules/face_model.py", line 40, in detect
    bboxes, landmarks = self.retina.detect(data, threshold=threshold)
  File "./modules/model_zoo/detectors/scrfd.py", line 173, in detect
    scores_list, bboxes_list, kpss_list = self.forward(img, threshold)
  File "./modules/model_zoo/detectors/scrfd.py", line 159, in forward
    bboxes = distance2bbox(anchor_centers, bbox_preds)
  File "./modules/model_zoo/detectors/scrfd.py", line 39, in distance2bbox
    x1 = points[:, 0] - distance[:, 0]
ValueError: operands could not be broadcast together with shapes (24832,) (25088,)

The InstanceNormalization layer was fixed in a newer TensorRT release, but there is no prebuilt NVIDIA container with it yet.

Hi! For SCRFD family models, input dimensions should be divisible by 32.
deepinsight/insightface#1578

So you should just change 1024,780 to 1024,768.
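To make the numbers in the traceback concrete, here is a rough sketch of the arithmetic (my own illustration, not code from the repo), assuming the anchor centers are built with floor division by each stride while the network's outputs effectively round up:

```python
import math

# Hypothetical illustration, not code from this repo: for each SCRFD stride,
# compare the number of anchor centers (floor division by the stride) with the
# number of predictions the network emits (stride-2 downsampling rounds up).
def check_scrfd_shape(width, height, strides=(8, 16, 32), num_anchors=2):
    for s in strides:
        centers = (height // s) * (width // s) * num_anchors
        preds = math.ceil(height / s) * math.ceil(width / s) * num_anchors
        status = "OK" if centers == preds else "MISMATCH"
        print(f"stride {s:2d}: {centers:6d} anchor centers vs {preds:6d} predictions -> {status}")

check_scrfd_shape(1024, 780)  # stride 8: 24832 vs 25088 -- the exact shapes in the traceback
check_scrfd_shape(1024, 768)  # every stride agrees once both dimensions are multiples of 32
```

With 1024,780 the stride-8 level already disagrees (24832 vs 25088), while with 1024,768 all levels line up.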

commented

Hi! For SCRFD family models, input dimensions should be divisible by 32.
deepinsight/insightface#1578

Oh, thanks!

Actually, I was struggling with the same issue when SCRFD was first released. I think I should mention this in the README or Dockerfile comments.
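A lightweight guard during model preparation could also avoid this class of error by snapping the requested size down to the nearest multiple of 32 before reshaping the ONNX inputs. A minimal sketch (the helper name and its placement are hypothetical, not existing repo code):

```python
# Hypothetical helper, not existing repo code: snap a requested detector input
# size down to the nearest multiple of 32 so SCRFD-family models get valid shapes.
def snap_size(width, height, multiple=32):
    return (width // multiple) * multiple, (height // multiple) * multiple

print(snap_size(1024, 780))  # -> (1024, 768)
```

That way a setting like MAX_SIZE=1024,780 could transparently become 1024,768, ideally with a warning in the log.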