A question about the time consumed by copying a tensor from CPU to GPU

Question

qimw opened this issue 5 years ago · comments

Hi all, I am trying to add a lighthead module to Faster R-CNN. I added a new light_branch to faster_rcnn_heads.py which combines the functions of box_head and box_out. This module needs to move rois in rpn_return from CPU to GPU, but I find this operation much slower than the roi_feature_transform in module_builder.py (0.04s vs <1e-3s).
Then I moved these operations into module_builder.py, and the copy time became much smaller, but the time to get restore_bl increased to 0.04s.
This has puzzled me for a long time. Could anyone figure it out?
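
One thing that may be worth ruling out (an assumption, nothing in this thread confirms it): CUDA kernels are launched asynchronously, so a wall-clock timer around a single line can end up charging earlier, still-running GPU work to whichever operation first has to wait for it. That would be consistent with the 0.04s "moving" from the copy to restore_bl when the code is rearranged. Below is a minimal sketch of how the measurement could be made with explicit synchronization. The helper `timed`, the function `main`, and the tensor shapes are hypothetical and are not taken from the repository.

```python
import time


def timed(fn, sync=None):
    """Run fn() and return (result, elapsed_seconds).

    If `sync` is given, it is called once before the timer starts and once
    after fn() returns, so that pending asynchronous work is not charged to
    fn() and fn()'s own asynchronous work is included in the measurement.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn()
    if sync is not None:
        sync()
    return result, time.perf_counter() - start


def main():
    # PyTorch and a CUDA device are assumed here; skip quietly otherwise.
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the GPU demo.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device available; skipping the GPU demo.")
        return

    # Hypothetical stand-in for the rois held on the CPU.
    rois = torch.rand(2000, 5)
    a = torch.rand(4096, 4096, device="cuda")

    def queue_gpu_work():
        # Launch several kernels; these calls return before the GPU finishes.
        for _ in range(20):
            _ = a @ a

    # Naive timing: the copy may have to wait for the queued kernels.
    torch.cuda.synchronize()
    queue_gpu_work()
    _, naive = timed(lambda: rois.cuda())

    # Synchronized timing: pending work is drained before the timer starts.
    queue_gpu_work()
    _, synced = timed(lambda: rois.cuda(), sync=torch.cuda.synchronize)

    print(f"copy timed without synchronization: {naive:.6f}s")
    print(f"copy timed with synchronization:    {synced:.6f}s")


if __name__ == "__main__":
    main()
```

If the synchronized number for the copy is small while the unsynchronized one is large, the 0.04s probably belongs to earlier GPU work rather than to the copy itself. In that case, timing each stage with `torch.cuda.synchronize()` on both sides should show where the time is actually spent.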

Gengcong Yang · Answer 1 · Fri May 22 2020 12:17:07 GMT+0800 (China Standard Time)

I am also puzzled by this problem...
Have you figured it out yet?