SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

Builds successfully, but the container doesn't start

bltcn opened this issue

commented

Windows 11, WSL2, Ubuntu 18.04
[screenshot: 微信图片_20211103184203]
How can I deal with it?

Hi! I haven't tested the image on Windows.
Have you checked container logs?

commented

Preparing models...
[04:56:39] INFO - Preparing 'glintr100' model...
[04:56:39] INFO - Building TRT engine for glintr100...
[TensorRT] WARNING: onnx2trt_utils.cpp:362: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: GPU error during getBestTactic: Conv_0 : invalid argument
[TensorRT] ERROR: 10: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node Conv_0.)
Traceback (most recent call last):
File "prepare_models.py", line 54, in
prepare_models()
File "prepare_models.py", line 49, in prepare_models
prepare_backend(model_name=model, backend_name=backend_name, im_size=max_size, force_fp16=force_fp16,
File "/app/modules/model_zoo/getter.py", line 157, in prepare_backend
convert_onnx(temp_onnx_model,
File "/app/modules/converters/onnx_to_trt.py", line 84, in convert_onnx
assert not isinstance(engine, type(None))
AssertionError
Starting InsightFace-REST using 1 workers.
[04:56:51] INFO - 1
[04:56:51] INFO - MAX_BATCH_SIZE: 1
[04:56:51] INFO - Reshaping ONNX inputs to: (1, 3, 640, 640)
[04:56:51] INFO - In shape: [dim_value: 1
, dim_value: 3
, dim_param: "?"
, dim_param: "?"
]
[04:56:51] INFO - Building TRT engine for scrfd_10g_gnkps...
[TensorRT] WARNING: onnx2trt_utils.cpp:362: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: GPU error during getBestTactic: Conv_0 + Relu_1 : invalid argument
[TensorRT] ERROR: 10: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node Conv_0 + Relu_1.)
Traceback (most recent call last):
File "/usr/local/bin/uvicorn", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1128, in call
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 425, in main
run(app, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 447, in run
server.run()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 68, in run
return asyncio.run(self.serve(sockets=sockets))
File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 76, in serve
config.load()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/config.py", line 448, in load
self.loaded_app = import_from_string(self.app)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/importer.py", line 21, in import_from_string
module = importlib.import_module(module_str)
File "/usr/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1014, in _gcd_import
File "", line 991, in _find_and_load
File "", line 975, in _find_and_load_unlocked
File "", line 671, in _load_unlocked
File "", line 848, in exec_module
File "", line 219, in _call_with_frames_removed
File "/app/./app.py", line 36, in
processing = Processing(det_name=configs.models.det_name, rec_name=configs.models.rec_name,
File "/app/./modules/processing.py", line 180, in init
self.model = FaceAnalysis(det_name=det_name, rec_name=rec_name, ga_name=ga_name, device=device,
File "/app/./modules/face_model.py", line 78, in init
self.det_model = Detector(det_name=det_name, device=device, max_size=self.max_size,
File "/app/./modules/face_model.py", line 37, in init
self.retina = get_model(det_name, backend_name=backend_name, force_fp16=force_fp16, im_size=max_size,
File "/app/./modules/model_zoo/getter.py", line 203, in get_model
model_path = prepare_backend(model_name, backend_name, im_size=im_size, max_batch_size=max_batch_size,
File "/app/./modules/model_zoo/getter.py", line 157, in prepare_backend
convert_onnx(temp_onnx_model,
File "/app/./modules/converters/onnx_to_trt.py", line 84, in convert_onnx
assert not isinstance(engine, type(None))
AssertionError

Have you tried running other GPU-based containers on WSL2, like TensorFlow benchmarks, to verify your WSL2 is properly configured for GPU usage?
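For example, NVIDIA's CUDA sample container makes a quick sanity check (image tag taken from NVIDIA's CUDA-on-WSL docs; adjust if it has moved):

```sh
# Quick check that Docker inside WSL2 can actually reach the GPU.
# If this prints the GPU name and an n-body benchmark result, the container runtime side is fine.
docker run --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```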

commented

Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1060]
10240 bodies, total time for 10 iterations: 8.868 ms
= 118.245 billion interactions per second
= 2364.896 single-precision GFLOP/s at 20 flops per interaction

Hm, then TensorRT should work as expected.

I can double-check that the latest published version of InsightFace-REST works out of the box, but unfortunately I can't help you with running it on Windows.

I have checked building from scratch with a clean clone of the repo; everything works as intended on Ubuntu 20.04.
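Roughly the steps I used (a sketch only; the script name matches what is referenced later in this thread, adjust paths to your checkout):

```sh
# Clean clone and TensorRT deployment, verified here on Ubuntu 20.04.
git clone https://github.com/SthPhoenix/InsightFace-REST.git
cd InsightFace-REST
# Build the TensorRT image and start the container with default settings.
./deploy_trt.sh
```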

Looks like it's a WSL-related problem.

commented

Thanks, I have tested the CPU version and it works fine. Maybe there is something wrong with the parameters in this case.

Quote from the Nvidia page above:

With the NVIDIA Container Toolkit for Docker 19.03, only --gpus all is supported.

This might be the case, since deploy_trt.sh tries to set a specific GPU. Try replacing line 99 with --gpus all.
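Roughly like this (a sketch only; the image name and the other flags in deploy_trt.sh are placeholders here, keep your existing values):

```sh
# Before (hypothetical): deploy_trt.sh pins a specific device, e.g.
#   docker run -d --gpus "device=0" ... insightface-rest
# After: expose all GPUs, which is all that older nvidia-container-toolkit
# versions support under Docker 19.03 / WSL2.
docker run -d \
    --gpus all \
    --name insightface-rest \
    insightface-rest
```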

Though according to the same document, there might also be issues with pinned memory required by TensorRT, and with concurrent CUDA streams.

If pinned memory is also an issue, you can try adding RUN $PIP_INSTALL onnxruntime-gpu to Dockerfile_trt and switching the inference backend to onnx in deploy_trt.sh at line 105.
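A sketch of both changes (the INFERENCE_BACKEND variable name and the build context are assumptions; check how your copy of deploy_trt.sh actually selects the backend):

```sh
# 1. In Dockerfile_trt, add the ONNX Runtime GPU wheel
#    ($PIP_INSTALL is assumed to be the pip wrapper already defined there):
#      RUN $PIP_INSTALL onnxruntime-gpu
# 2. Rebuild the image from the directory containing Dockerfile_trt:
docker build -t insightface-rest:onnx -f Dockerfile_trt .
# 3. In deploy_trt.sh, pass the ONNX backend to the container instead of TensorRT, e.g.:
#      -e INFERENCE_BACKEND=onnx \
```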

commented

Thanks, I will try.

Hi! Any updates? Have you managed to run it under WSL2?

commented

Sorry, I just saw your reply. I will try.

Looks like WSL2 just wasn't supported by TensorRT, but according to the changelog the latest TensorRT version should support it. Try using the 21.12 TensorRT image.
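For example (tag assumed from the usual NGC naming; the FROM line lives in Dockerfile_trt):

```sh
# Pull the 21.12 TensorRT base image from NGC...
docker pull nvcr.io/nvidia/tensorrt:21.12-py3
# ...then point the FROM line in Dockerfile_trt at it, e.g.:
#   FROM nvcr.io/nvidia/tensorrt:21.12-py3
# and re-run the build / deploy script.
```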

I tried the 21.12 and 22.01 TensorRT images; unfortunately, both failed. 21.12 reports "GPU error during getBestTactic", and 22.01 reports "Cuda failure: integrity checks failed".

Have you tried running other GPU-based containers on WSL2 to ensure everything is installed correctly?