SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

Unable to process images which are received from socketio image buffer

bhargavravat opened this issue · comments

Hi @SthPhoenix,

If I run the code on images stored on my machine, it works fine. However, when I try to process an image received over a socket connection, I get this error:

Will you be able to give me some hint in this case?

[TensorRT] ERROR: ../rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
[TensorRT] ERROR: ../rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception

Note:
Before sending the image to inference, I follow these steps:

import cv2
import numpy as np

# Model
model = FaceAnalysis(max_size=[640, 640], backend_name='trt',
                     det_name='retinaface_mnet025_v1',
                     rec_name='arcface_r100_v1', max_rec_batch_size=64)

# Here I receive the image bytes
recvdImg = data.get("frame")

# Wrap the raw bytes in a uint8 numpy array, skipping the first byte
nparr = np.frombuffer(recvdImg, dtype=np.uint8, offset=1)

# Decode the compressed image as-is
img_np = cv2.imdecode(nparr, cv2.IMREAD_UNCHANGED)

# Run inference
faces = model.get(img_np)

From reading a few blog posts, I found that this error has something to do with the CUDA context and can be solved by

saving the CUDA context.
The issue here is that the CUDA context gets refreshed and mixed up with other applications.
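The "saving the CUDA context" idea from those blog posts amounts to a push/pop discipline around each inference call: bind the context to the calling thread before the TensorRT execution and always unbind it afterwards. A minimal sketch of that pattern, with `push_fn`/`pop_fn` as stand-ins for pycuda's `ctx.push`/`ctx.pop` so it can run without a GPU:

```python
from contextlib import contextmanager

@contextmanager
def active_cuda_context(push_fn, pop_fn):
    """Make a CUDA context current for the duration of the block.

    push_fn/pop_fn stand in for pycuda's ctx.push / ctx.pop here.
    """
    push_fn()       # bind the context to the calling thread
    try:
        yield
    finally:
        pop_fn()    # always unbind, even if inference raises
```

With pycuda this would be used as `with active_cuda_context(ctx.push, ctx.pop): faces = model.get(img_np)`, which prevents the "invalid resource handle" that appears when an execution runs under the wrong (or no) current context.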

Can you help me out now?

Could you please provide more info on how the image is sent and received?

I am connected to a server!

The server is sending me an image as a byte array.

I then convert that byte array into a numpy array and process it further.

I am only receiving the image, not sending it!
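The receive path described above can be sketched with plain numpy. The one-byte header skip mirrors the `offset=1` in the snippet earlier in the thread; it is an assumption about the server's framing, not a documented protocol:

```python
import numpy as np

def frame_bytes_to_array(payload: bytes, header_len: int = 1) -> np.ndarray:
    """Wrap a received byte payload in a uint8 array, skipping a header.

    header_len=1 mirrors the offset=1 used in the snippet above; adjust it
    to whatever framing your server actually uses.
    """
    if len(payload) <= header_len:
        raise ValueError("payload too short to contain image data")
    return np.frombuffer(payload, dtype=np.uint8, offset=header_len)
```

The resulting array would then go to `cv2.imdecode`, as in the earlier snippet.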

It's hard to tell which cause might be behind this exception without additional info, but I think it's most likely connected to the way you read the image before feeding it to inference.
I have tested this code reading images from a RabbitMQ queue and from a FastAPI endpoint, which is the default behavior of this project, and haven't seen such exceptions.
If you could provide a minimal reproducible example, I could try checking it.

Could you also provide a minimal server example, for testing the full cycle?

EDIT: Have you tried saving the image decoded by OpenCV to disk with cv2.imwrite?

Yes, the image is exactly as it should be!
There are no issues with the input image, that's for sure! I cross-checked by saving that image to local storage and inspecting it; it is how it should be!

Found the solution!!
Closing it

Great news, @bhargavravat! Could you please post your solution? I'll apply it to the master branch if it doesn't conflict with FastAPI.

I have modified trt_loader.py (/app/modules/model_zoo/exec_backends/trt_loader.py)

Attaching it as a .txt file:

trt_loader.txt

I have tested your solution with /src/converters/pipeline_tester.py, but it fails with the following error:

PyCUDA ERROR: The context stack was not empty upon module cleanup.

I was able to fix it by replacing line 84:

self.cuda_ctx = cuda_ctx

with:

device = cuda.Device(0)  # enter your gpu id here
ctx = device.make_context()
self.cuda_ctx = ctx

This worked, but it caused excessive GPU RAM usage, about 1 GB extra.

Hello @SthPhoenix, how can I create one engine (e.g. face detection) to serve more than two cameras using threading?
I tried, but I got confused outputs between these threads.

Hi @ThiagoMateo! As I have said before, this use case is not tested; for now you can use the provided REST API in your application.
In future versions I'm planning to add full support for Triton Inference Server, which will give you the ability to serve a single model across multiple threads.
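Until such support lands, one hypothetical way to share a single engine across camera threads without outputs getting mixed up is to serialize inference calls with a lock, so only one thread touches the engine (and its CUDA context) at a time. `infer_fn` here is a stand-in for the real TensorRT detection call:

```python
import threading

class SharedEngine:
    """Serialize access to one inference engine across many camera threads."""

    def __init__(self, infer_fn):
        self._infer = infer_fn          # e.g. the TRT detection call
        self._lock = threading.Lock()

    def detect(self, frame):
        with self._lock:                # one inference at a time
            return self._infer(frame)
```

This trades throughput for correctness; a proper multi-client setup is what Triton Inference Server is designed for.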