openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™

Home Page: https://openvinotoolkit.github.io/training_extensions/


OTX 2.0 exported model test issue: "Encountered different devices in metric calculation"

goodsong81 opened this issue

Describe the bug

Device mismatch when running otx test on an exported model without "--config src/otx/recipe/{model.task}/openvino_model.yaml"

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

The above exception was the direct cause of the following exception:
...
RuntimeError: Encountered different devices in metric calculation (see stacktrace for details). This could be due to the metric class not being on the same device as input. Instead of `metric=MultilabelAccuracy(...)` try to do `metric=MultilabelAccuracy(...).to(device)` where device corresponds to the device of the input.
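The error above boils down to the metric's internal state living on one device (cuda:0) while the OpenVINO inference results arrive on another (cpu). A minimal pure-Python sketch can illustrate why `.to(device)` resolves it; the `Tensor` and `Accuracy` classes below are toy stand-ins, not OTX or torchmetrics code:

```python
class Tensor:
    """Toy tensor that only tracks its device, mimicking torch.Tensor."""
    def __init__(self, value, device="cpu"):
        self.value = value
        self.device = device


class Accuracy:
    """Toy metric that, like torchmetrics, keeps its state on a fixed device."""
    def __init__(self, device="cpu"):
        self.device = device
        self.correct = 0
        self.total = 0

    def to(self, device):
        self.device = device
        return self

    def update(self, preds, target):
        # torchmetrics raises a similar RuntimeError when devices differ
        if preds.device != self.device:
            raise RuntimeError("Encountered different devices in metric calculation")
        self.correct += int(preds.value == target.value)
        self.total += 1


metric = Accuracy(device="cuda:0")   # metric created for GPU training
cpu_pred = Tensor(1, device="cpu")   # OpenVINO inference returns CPU tensors

try:
    metric.update(cpu_pred, Tensor(1, device="cpu"))
except RuntimeError as err:
    print(err)                       # reproduces the device mismatch

metric.to("cpu")                     # the suggested fix: metric.to(device)
metric.update(cpu_pred, Tensor(1, device="cpu"))
print(metric.correct, metric.total)  # prints: 1 1
```

The same logic applies in the real stack: when the exported IR model is evaluated on CPU, the metric must be moved to CPU as well.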

[Failing tasks]

  • Multi-label classification
  • H-label classification
  • Tiling-instance segmentation
  • Semantic segmentation
  • Visual prompting

Steps to Reproduce

otx train --data_root tests/assets/multilabel_classification/ --work_dir /tmp/mlc
otx export --work_dir /tmp/mlc
otx test --work_dir /tmp/mlc --checkpoint /tmp/mlc/.latest/export/exported_model.xml

(Similar for other tasks)

@kprokofi I don't think this is an issue, but what do you think? If we are testing the IR model, we should definitely specify --engine.device cpu, right?

Yes, you are right. Specifying --engine.device cpu will work.
However, we may want to add automatic device configuration.

Simply adding two lines will help:
[screenshot of the proposed two-line change]

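The actual two-line change lives in #3220 and is not reproduced here. As a hypothetical sketch of the auto-configuration idea (the `configure_device` helper below is illustrative, not the real OTX API), the engine could fall back to CPU whenever the checkpoint is an exported OpenVINO IR (.xml) file:

```python
from pathlib import Path

# Hypothetical helper, not the actual OTX implementation.
def configure_device(checkpoint: str, requested_device: str = "cuda") -> str:
    """Fall back to CPU when testing an exported OpenVINO IR model.

    OpenVINO inference runs on CPU here, so forcing "cpu" keeps the metric
    and the model outputs on the same device and avoids the RuntimeError.
    """
    if Path(checkpoint).suffix == ".xml":  # exported IR model
        return "cpu"
    return requested_device  # torch checkpoint: keep the user's choice


print(configure_device("/tmp/mlc/.latest/export/exported_model.xml"))  # cpu
print(configure_device("/tmp/mlc/.latest/train/best.ckpt"))            # cuda
```

With a check like this in place, users would no longer need to pass --engine.device cpu manually when testing exported models.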
Resolved by #3220, merged into 2.0.