OpenGVLab / InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Paper: https://arxiv.org/abs/2211.05778

Latency too high with DCNv3_pytorch op on CPU

xbkaishui opened this issue · comments

Hi guys,

When using DCNv3_pytorch for inference, the latency is too high on a CPU-only device.
I tested inference on both CUDA and CPU:
on the CUDA device, per-image latency is 29 ms with DCNv3_pytorch and 20 ms with the compiled DCNv3 CUDA op
on the CPU device, per-image latency is 550 ms (tested with the pure-PyTorch implementation only)
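For reference, a minimal sketch of how per-image latency numbers like these can be measured; the `dummy_forward` function is a hypothetical stand-in for a real forward pass, not the actual InternImage model:

```python
import time
import statistics

def measure_latency_ms(fn, warmup=5, iters=20):
    """Return the median wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up runs are excluded from timing
        fn()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(times)

# Hypothetical stand-in for one forward pass; in practice this would be
# `model(image)` on the CPU or CUDA device being compared.
def dummy_forward():
    sum(i * i for i in range(10_000))

print(f"median latency: {measure_latency_ms(dummy_forward):.2f} ms")
```

Note that when timing on CUDA, a `torch.cuda.synchronize()` call is needed before each timer read, since kernel launches are asynchronous and would otherwise be excluded from the measured time.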

Do you have any ideas for optimizing the CPU inference cost?

Thanks.