AlexeyAB / yolo2_light

Light version of convolutional neural network Yolo v3 & v2 for object detection with a minimum of dependencies (INT8-inference, BIT1-XNOR-inference)

Is it possible to use full INT8 during tiny-yolo-v3 inference?

jasonwu1977 opened this issue

Hi @AlexeyAB
I would like to know whether it is possible to use full INT8 throughout yolo2_light:

  1. After the pre-processing step, is it possible to convert the input to INT8 before it enters the first layer?
  2. Can I use INT8/INT16 biases?
  3. Can I save each layer's output as 8-bit before the next layer (of course, before the yolo layer it stays float32)?

@AlexeyAB
Answering my own question:
All three answers are yes, and I have tested them on CPU.
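
For illustration, here is a minimal sketch (not code from this repo) of how the pre-processed float input could be quantized to INT8 with a symmetric per-tensor multiplier before the first convolutional layer; the multiplier is an assumed calibration constant, not a value taken from yolo2_light:

```c
#include <math.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: quantize pre-processed float activations to INT8
 * with a symmetric per-tensor multiplier before the first layer.
 * 'multiplier' stands in for a calibration constant and is assumed here,
 * not taken from this repo. */
static void quantize_f32_to_int8(const float *src, int8_t *dst, size_t n,
                                 float multiplier)
{
    for (size_t i = 0; i < n; ++i) {
        float v = roundf(src[i] * multiplier);  /* scale into INT8 range */
        if (v > 127.f)  v = 127.f;              /* clamp to int8 limits  */
        if (v < -128.f) v = -128.f;
        dst[i] = (int8_t)v;
    }
}
```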

I want to run quantization in pure CPU mode.
When I change the Makefile so that yolo2_light runs in CPU mode, the mAP drops very low. After tracing the code, I noticed that the quantization in CUDNN (GPU) mode is different from CPU mode (some layers are not quantized in CUDNN mode).

Now the question is: in yolov2 and yolov3, in the CUDNN path, the size=1 & stride=2 layers are not quantized. Is it possible to quantize them in CPU mode?
And if I want to change the CPU code to quantize all convolutional layers, where should I modify it?
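
To make the question concrete, here is a hedged sketch of what a fully-INT8 convolutional step on CPU could look like: int8 inputs and weights, an int32 accumulator with an int32 bias, then requantization of the result back to int8 for the next layer. The function and parameter names are hypothetical, not the ones used in yolo2_light:

```c
#include <math.h>
#include <stdint.h>

/* Sketch of one fully-INT8 output value of a convolution: int8 inputs and
 * weights, int32 accumulation, int32 bias, then requantization back to
 * int8 for the next layer. All names are illustrative only. */
static int8_t int8_dot_requant(const int8_t *in, const int8_t *w, int n,
                               int32_t bias, float requant_scale)
{
    int32_t acc = bias;                     /* accumulate in 32 bits */
    for (int i = 0; i < n; ++i)
        acc += (int32_t)in[i] * (int32_t)w[i];

    float v = roundf((float)acc * requant_scale);  /* rescale for next layer */
    if (v > 127.f)  v = 127.f;              /* clamp to the int8 range */
    if (v < -128.f) v = -128.f;
    return (int8_t)v;
}
```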

@jasonwu1977 Hi Jason, I ran into a similar issue. Could we discuss it further through my QQ 494529371? Big thanks.

Have you solved it yet? Or does anyone else know?