SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

GPU Quadro RTX 5000 error

MyraBaba opened this issue

Hi,

I get the error below on our Lenovo laptop, which has an NVIDIA Quadro RTX 5000 GPU:

Starting 1 workers on 1 GPUs (1 workers per GPU)
Containers port range: 18081 - 18081
insightface-rest-gpu0-trt
--- Starting container insightface-rest-gpu0-trt with "device=0" at port 18081
Preparing models...
[14:40:15] INFO - Preparing 'scrfd_10g_gnkps' model...
[10/13/2023-14:40:15] [TRT] [W] Unable to determine GPU memory usage
[10/13/2023-14:40:15] [TRT] [W] Unable to determine GPU memory usage
[10/13/2023-14:40:15] [TRT] [W] CUDA initialization failure with error: 999. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Traceback (most recent call last):
File "/app/prepare_models.py", line 53, in <module>
prepare_models()
File "/app/prepare_models.py", line 45, in prepare_models
prepare_backend(model_name=model, backend_name=settings.models.inference_backend, im_size=max_size,
File "/app/modules/model_zoo/getter.py", line 137, in prepare_backend
has_fp16 = check_fp16()
File "/app/modules/converters/onnx_to_trt.py", line 66, in check_fp16
builder = trt.Builder(TRT_LOGGER)
TypeError: pybind11::init(): factory function returned nullptr
Starting InsightFace-REST using 1 workers.
[2023-10-13 14:40:15 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2023-10-13 14:40:15 +0000] [1] [INFO] Listening at: http://0.0.0.0:18080 (1)
[2023-10-13 14:40:15 +0000] [1] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2023-10-13 14:40:15 +0000] [41] [INFO] Booting worker with pid: 41
[10/13/2023-14:40:16] [TRT] [W] Unable to determine GPU memory usage
[10/13/2023-14:40:16] [TRT] [W] Unable to determine GPU memory usage
[10/13/2023-14:40:16] [TRT] [W] CUDA initialization failure with error: 999. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
[2023-10-13 14:40:16 +0000] [41] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gunicorn/arbiter.py", line 609, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.10/dist-packages/uvicorn/workers.py", line 66, in init_process
super(UvicornWorker, self).init_process()
File "/usr/local/lib/python3.10/dist-packages/gunicorn/workers/base.py", line 134, in init_process
self.load_wsgi()
File "/usr/local/lib/python3.10/dist-packages/gunicorn/workers/base.py", line 146, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python3.10/dist-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python3.10/dist-packages/gunicorn/app/wsgiapp.py", line 58, in load
return self.load_wsgiapp()
File "/usr/local/lib/python3.10/dist-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python3.10/dist-packages/gunicorn/util.py", line 371, in import_app
mod = importlib.import_module(module)
File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
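File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed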
File "/app/app.py", line 34, in <module>
processing = Processing(det_name=settings.models.det_name, rec_name=settings.models.rec_name,
File "/app/modules/processing.py", line 32, in __init__
self.model = FaceAnalysis(det_name=det_name,
File "/app/modules/face_model.py", line 98, in __init__
self.det_model = Detector(det_name=det_name, max_size=self.max_size,
File "/app/modules/face_model.py", line 56, in __init__
self.retina = get_model(det_name, backend_name=backend_name, force_fp16=force_fp16, im_size=max_size,
File "/app/modules/model_zoo/getter.py", line 213, in get_model
model_path = prepare_backend(model_name, backend_name, im_size=im_size, max_batch_size=max_batch_size,
File "/app/modules/model_zoo/getter.py", line 137, in prepare_backend
has_fp16 = check_fp16()
File "/app/modules/converters/onnx_to_trt.py", line 66, in check_fp16
builder = trt.Builder(TRT_LOGGER)
TypeError: pybind11::init(): factory function returned nullptr
[2023-10-13 14:40:16 +0000] [41] [INFO] Worker exiting (pid: 41)
[2023-10-13 14:40:16 +0000] [1] [ERROR] Worker (pid:41) exited with code 3
[2023-10-13 14:40:16 +0000] [1] [ERROR] Shutting down: Master
[2023-10-13 14:40:16 +0000] [1] [ERROR] Reason: Worker failed to boot.
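The `trt.Builder(TRT_LOGGER)` failure in both tracebacks follows the "CUDA initialization failure with error: 999" warning, so the problem appears to sit below TensorRT. A minimal sketch for checking this directly, assuming the driver library is exposed as `libcuda.so.1` (the usual name on Linux): it calls the CUDA driver API function `cuInit` through `ctypes` and reports the result code. The helper name `probe_cuda` and its `load_library` parameter are hypothetical and exist only for this illustration; they are not part of InsightFace-REST.

```python
import ctypes

# CUresult values from the CUDA driver API; 999 is the code shown in the log.
CUDA_SUCCESS = 0
CUDA_ERROR_UNKNOWN = 999


def probe_cuda(load_library=ctypes.CDLL, lib_name="libcuda.so.1"):
    """Call cuInit(0) directly and return (result_code, message).

    result_code is None when the driver library cannot be loaded at all.
    """
    try:
        libcuda = load_library(lib_name)
    except OSError as exc:
        # Driver library not visible, e.g. a container started without GPU access.
        return None, f"cannot load {lib_name}: {exc}"

    code = libcuda.cuInit(0)  # cuInit takes a flags argument that must be 0
    if code == CUDA_SUCCESS:
        return code, "CUDA driver initialized"
    if code == CUDA_ERROR_UNKNOWN:
        return code, "cuInit failed with CUDA_ERROR_UNKNOWN (999)"
    return code, f"cuInit failed with error {code}"
```

Running `print(probe_cuda())` once on the host and once inside the container may help narrow things down: if the host also returns 999, the issue is likely in the host driver setup rather than in the image.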

2 - Can we use the repo on Ubuntu 18.04 with NVIDIA-SMI 460.27.04 (Driver Version: 460.27.04, CUDA Version: 11.2)?

This error seems to be the result of a driver misconfiguration or mismatch.
I haven't tested this repo on Ubuntu prior to 20.04, so I can't guarantee it would work.
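One way to look for such a mismatch is to compare the CUDA version reported in the `nvidia-smi` banner with the CUDA version the container image was built against. The sketch below parses the banner line quoted in the question and applies a coarse check on the major version only. The thread does not state which CUDA version the image uses, so the `required_cuda` values are placeholders, and the function names are hypothetical. Treat the major-version rule as a rough heuristic, not a guarantee of compatibility.

```python
import re

# Matches the banner line printed by nvidia-smi, as quoted in the question.
BANNER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_banner(line):
    """Return (driver_version, cuda_version) as tuples of ints."""
    match = BANNER_RE.search(line)
    if match is None:
        raise ValueError("no driver/CUDA version found in line")
    driver = tuple(int(part) for part in match.group("driver").split("."))
    cuda = tuple(int(part) for part in match.group("cuda").split("."))
    return driver, cuda


def same_major_or_newer(host_cuda, required_cuda):
    """Coarse heuristic: the host's CUDA major version is at least the required one."""
    return host_cuda[0] >= required_cuda[0]


banner = "| NVIDIA-SMI 460.27.04 Driver Version: 460.27.04 CUDA Version: 11.2 |"
driver, host_cuda = parse_smi_banner(banner)
print(driver, host_cuda)  # (460, 27, 4) (11, 2)

# Placeholder requirement; replace with the CUDA version of the actual image.
print(same_major_or_newer(host_cuda, required_cuda=(12, 0)))  # False
```

If the check fails for the image in use, updating the host driver is the likely next step; if it passes, the mismatch may lie elsewhere, for example in how the GPU is exposed to the container.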