clovaai / rexnet

Official PyTorch implementation of ReXNet (Rank eXpansion Network) with pretrained models


Converting to ONNX produces TracerWarnings

dengfenglai321 opened this issue · comments

Hi,
I converted ReXNet to ONNX and got the warnings below:
F:\rexnetv1.py:122: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator add_. This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
out[:, 0:self.in_channels] += x
F:\rexnetv1.py:122: TracerWarning: There are 4 live references to the data region being modified when tracing in-place operator copy_ (possibly due to an assignment). This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
out[:, 0:self.in_channels] += x

Could you help me?
I think this makes the ONNX result differ from the PyTorch result.
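
For context, the warnings come from the in-place addition on a slice of out in the block's forward (rexnetv1.py line 122). A minimal sketch that reproduces them (the module and variable names here are illustrative, not the actual ReXNet code):

import torch
import torch.nn as nn

class ShortcutBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.in_channels = in_channels
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.conv(x)
        # In-place add on a slice view of `out` is what the tracer warns about.
        out[:, 0:self.in_channels] += x
        return out

model = ShortcutBlock(16, 32).eval()
dummy = torch.randn(1, 16, 8, 8)
torch.onnx.export(model, dummy, "shortcut.onnx")  # emits the TracerWarning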

@cendelian Sorry to bother you. I ran into the same problem you described. Did you solve it? If so, could you tell me how? Thanks a lot.

@ntut108318099 If a conversion error (or warning) occurs at the shortcut connection, please use the following alternative; with torch.onnx.export, I have confirmed that our model converts successfully:

# Split out the channels past in_channels, add the shortcut to the first
# in_channels channels, then re-assemble without any in-place operation.
slice_zero = torch.narrow(out, dim=1, start=self.in_channels, length=self.out_channels - self.in_channels)
slice_sum = torch.narrow(out, dim=1, start=0, length=self.in_channels) + x
out = torch.cat((slice_sum, slice_zero), dim=1)
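
For reference, the export call itself can be minimal (a sketch assuming the ReXNetV1 class from this repo's rexnetv1.py; the input shape and opset version are illustrative):

import torch
from rexnetv1 import ReXNetV1  # model definition from this repo

model = ReXNetV1(width_mult=1.0).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "rexnetv1.onnx", opset_version=11,
                  input_names=["input"], output_names=["output"])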

@dyhan0920 Thanks for your reply. I also found another way to convert it successfully:

    def forward(self, x):
        out = self.out(x)

        if self.use_shortcut:
            # Concatenate the shortcut-added slice with the remaining
            # channels instead of adding in place.
            out = torch.cat((torch.add(out[:, :self.in_channels], x),
                             out[:, self.in_channels:]), dim=1)
            # out[:, :self.in_channels] += x
        return out

Note that I used the torch2trt wrapper (https://github.com/NVIDIA-AI-IOT/torch2trt) to convert the module, so I don't know whether this code also converts through ONNX.
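
For reference, the torch2trt conversion follows the wrapper's standard pattern (a sketch; the input shape is illustrative, and it assumes the modified forward above has been applied):

import torch
from torch2trt import torch2trt
from rexnetv1 import ReXNetV1

model = ReXNetV1(width_mult=1.0).eval().cuda()
x = torch.randn(1, 3, 224, 224).cuda()
model_trt = torch2trt(model, [x])  # returns a TRTModule usable like the original
y_trt = model_trt(x)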

I will try your method and see the result; I think your method is better than mine.

Thanks for your help.
Have a nice day.

@ntut108318099 Thanks for sharing the implementation. I think your implementation would also work very well.
I have used ONNX as a proxy framework to convert the model to TF2 or TF1, which worked successfully.
I will also confirm whether the code converts to TensorRT.
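
For anyone wanting to try the same route, a minimal sketch using the onnx-tf package (one common toolchain for this; not necessarily the exact one used here):

import onnx
from onnx_tf.backend import prepare

onnx_model = onnx.load("rexnetv1.onnx")  # the ONNX file exported above
tf_rep = prepare(onnx_model)             # wrap the ONNX graph as a TensorFlow representation
tf_rep.export_graph("rexnetv1_tf")       # write a TensorFlow SavedModel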

Please let me know if you run into any problems or need help.
Thank you for your interest in our model!

@dyhan0920 Sorry to bother you again.
Your code runs fine with torch2trt, but when I used it for inference, the inference time was much longer than with the code I provided.

I suspect the cause is torch.narrow. You could try it yourself; maybe something in my setup was wrong.
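
For anyone who wants to reproduce the comparison, a rough GPU timing sketch (illustrative; not the exact benchmark I ran):

import time
import torch

def benchmark(model, x, n_iters=100):
    # Warm up, then average n_iters forward passes with GPU synchronization.
    with torch.no_grad():
        for _ in range(10):
            model(x)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(n_iters):
            model(x)
        torch.cuda.synchronize()
    return (time.time() - start) / n_iters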
Thanks for your help.

I much appreciate your work. I would like to cite your paper in my research.

But I have a couple of questions:

(1) I found that your paper on arXiv has been updated three times.
Could you tell me what is different between those versions?

(2) I found that the channels of ReXNet-lite-1.0 differ from those of ReXNet.

ReXNet-lite: [32, 16, 32, 40, 48, 64, 72, 80, 96, 104, 112, 128, 136, 144, 160, 168]
ReXNet: [32, 16, 27, 38, 50, 61, 72, 84, 95, 106, 117, 128, 140, 151, 162, 174]

How did you determine the rule for the ReXNet-lite channels? I have read your paper, but I could not find the reason.
If the reason is stated in the paper, could you tell me where to find it? Thanks~

Thanks for reading.
Have a nice day.

@ntut108318099 Thank you very much for reporting the actual inference-speed comparison of the two versions on TensorRT. I also confirmed that torch.narrow does not currently seem to be a suitable op for TensorRT conversion.

I would like to answer your questions. For the first question, the updates were made successively: 1) for the CVPR 2021 submission (v1->v2), we refined the paper with more empirical backup for the proposed architecture (e.g., we included model search results that support how we set up the entire channel configuration of ReXNet); 2) we added the ReXNet-lite results (v2->v3). Note that the ReXNet architecture itself has remained unchanged from v1 to the latest version, v3.

For the second question, I applied the _make_divisible function to all the channels of ReXNet-lite to make them divisible by 8, so the channel values changed and may look non-intuitive. The resulting model did not have the computational cost (e.g., FLOPs) I had expected, so I adjusted the channel sizes, which is why they differ from those of ReXNet.
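
For context, _make_divisible is the usual MobileNet-style helper that rounds a channel count to the nearest multiple of a divisor; a sketch of the common implementation:

def _make_divisible(v, divisor=8, min_value=None):
    # Round v to the nearest multiple of divisor, without going below
    # min_value or decreasing v by more than 10%.
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v

# e.g., _make_divisible(27) == 32 and _make_divisible(38) == 40,
# consistent with the ReXNet -> ReXNet-lite channel lists above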

Since the ReXNet-lite models are designed for practitioners and were not intended to be the main content of our paper, the details were omitted. Sorry for the confusion.

Thanks again for your interest in our work!
Have a nice day too!