SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

Unable to process images which are received from socketio image buffer

bhargavravat opened this issue · comments

Hi @SthPhoenix,

If I run the code on images stored on my machine, it works fine. However, when I try to process an image received over a socket connection, I get this error:

Will you be able to give me some hint in this case?

[TensorRT] ERROR: ../rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
[TensorRT] ERROR: ../rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception

Note:
Before sending the image to inference, I follow these steps:

import cv2
import numpy as np

# Model
model = FaceAnalysis(max_size=[640, 640], backend_name='trt',
                     det_name='retinaface_mnet025_v1',
                     rec_name='arcface_r100_v1', max_rec_batch_size=64)

# Here I receive the image bytes
recvdImg = data.get("frame")

# Wrap the raw bytes in a uint8 numpy array, skipping the first byte
nparr = np.frombuffer(recvdImg, dtype=np.uint8, offset=1)

# Decode the compressed image as-is
img_np = cv2.imdecode(nparr, cv2.IMREAD_UNCHANGED)

# Run inference
faces = model.get(img_np)

From reading a few blog posts, I found that this error has something to do with the CUDA context and can be solved by

saving the CUDA context.
The issue here is that the CUDA context gets refreshed and mixed up with other applications.
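The "saving the CUDA context" idea from those blog posts amounts to a push/pop discipline around each inference call: bind the context to the calling thread before the TensorRT execution and always unbind it afterwards. A minimal sketch of that pattern, with `push_fn`/`pop_fn` as stand-ins for pycuda's `ctx.push`/`ctx.pop` so it can run without a GPU:

```python
from contextlib import contextmanager

@contextmanager
def active_cuda_context(push_fn, pop_fn):
    """Make a CUDA context current for the duration of the block.

    push_fn/pop_fn stand in for pycuda's ctx.push / ctx.pop here.
    """
    push_fn()       # bind the context to the calling thread
    try:
        yield
    finally:
        pop_fn()    # always unbind, even if inference raises
```

With pycuda this would be used as `with active_cuda_context(ctx.push, ctx.pop): faces = model.get(img_np)`, which prevents the "invalid resource handle" that appears when an execution runs under the wrong (or no) current context.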

Can you help me out now?

Could you please provide more info on how the image is sent and received?

I am connected to a server!

The server is sending me an image as a byte array.

I then convert that byte array into a numpy array and process it further.

I am only receiving the image, not sending it!
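The receive path described above can be sketched with plain numpy. The one-byte header skip mirrors the `offset=1` in the snippet earlier in the thread; it is an assumption about the server's framing, not a documented protocol:

```python
import numpy as np

def frame_bytes_to_array(payload: bytes, header_len: int = 1) -> np.ndarray:
    """Wrap a received byte payload in a uint8 array, skipping a header.

    header_len=1 mirrors the offset=1 used in the snippet above; adjust it
    to whatever framing your server actually uses.
    """
    if len(payload) <= header_len:
        raise ValueError("payload too short to contain image data")
    return np.frombuffer(payload, dtype=np.uint8, offset=header_len)
```

The resulting array would then go to `cv2.imdecode`, as in the earlier snippet.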

It's hard to tell which cause might be behind this exception without additional info, but I think it's most likely connected to the way you read the image before feeding it to inference.
I have tested this code reading images from a RabbitMQ queue and from a FastAPI endpoint, which is the default behavior of this project, and haven't seen such exceptions.
If you could provide a minimal reproducible example, I could try checking it.

Could you also provide a minimal server example, for testing the full cycle?

EDIT: Have you tried saving the image decoded by OpenCV to disk with cv2.imwrite?

Yes, the image is exactly as it should be!
There are no issues with the input image, that's for sure! I cross-checked by saving that image to local storage and inspecting it; it is how it should be!

Found the solution!!
Closing it

Great news, @bhargavravat! Could you please post your solution? I'll apply it to the master branch if it doesn't conflict with FastAPI.

I have modified trt_loader.py (/app/modules/model_zoo/exec_backends/trt_loader.py)

Attaching it as a .txt file:

trt_loader.txt

I have tested your solution with /src/converters/pipeline_tester.py, but it fails with the following error:

PyCUDA ERROR: The context stack was not empty upon module cleanup.

I was able to fix it by replacing line 84:

self.cuda_ctx = cuda_ctx

with:

device = cuda.Device(0)  # enter your gpu id here
ctx = device.make_context()
self.cuda_ctx = ctx

This worked, but it caused excessive GPU RAM usage, about 1 GB extra.

Hello @SthPhoenix, how can I create one engine (e.g. face detection) to serve more than two cameras using threading?
I tried, but I got confused outputs between these threads.

Hi @ThiagoMateo! As I have said before, this use case is not tested; for now you can use the provided REST API in your application.
In future versions I'm planning to add full support for Triton Inference Server, which will give you the ability to serve a single model across multiple threads.
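Until such support lands, one hypothetical way to share a single engine across camera threads without outputs getting mixed up is to serialize inference calls with a lock, so only one thread touches the engine (and its CUDA context) at a time. `infer_fn` here is a stand-in for the real TensorRT detection call:

```python
import threading

class SharedEngine:
    """Serialize access to one inference engine across many camera threads."""

    def __init__(self, infer_fn):
        self._infer = infer_fn          # e.g. the TRT detection call
        self._lock = threading.Lock()

    def detect(self, frame):
        with self._lock:                # one inference at a time
            return self._infer(frame)
```

This trades throughput for correctness; a proper multi-client setup is what Triton Inference Server is designed for.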