Performance questions

HoBeom opened this issue 3 years ago · comments

Thank you for your good research, and for releasing the code so quickly.
I tried to reproduce your results in the same environment.
However, because I trained on a single GPU, the batch size was halved, and I got the results below.
I edited line 4 of the config to GPUS: (1,) (in my environment, GPU 0 is a 3080 and GPU 1 is a 2080 Ti).

Posetrack 2017 val

Model	Head	Shoulder	Elbow	Wrist	Hip	Knee	Ankle	Mean
DcPose_RSN	86.3907	87.718	83.2292	76.2394	80.1681	79.1894	71.2038	80.9779

Posetrack 2018 val

Model	Head	Shoulder	Elbow	Wrist	Hip	Knee	Ankle	Mean
DcPose_RSN	83.8176	86.2703	81.4414	75.3437	77.1077	77.9727	72.2061	79.4758

Is the gap from the reported performance caused by the smaller batch size?

I would like to replace the deform conv module with the torchvision implementation so that your code runs on CUDA 11.1.
https://pytorch.org/vision/stable/_modules/torchvision/ops/deform_conv.html#deform_conv2d

I also encountered an error on the PoseTrack 2017 test dataset.

2021-04-02 14:23:13 [engine.core.function] INFO: test: [3100/5462]      Time 1.659 (1.713)      Data 0.027s (0.083s)    Accuracy 0.000 (0.006)
2021-04-02 14:25:59 [engine.core.function] INFO: test: [3200/5462]      Time 1.659 (1.712)      Data 0.027s (0.081s)    Accuracy 0.000 (0.006)
Traceback (most recent call last):
  File "run.py", line 33, in <module>
    main()
  File "run.py", line 29, in main
    runner.launch()
  File "/DCPose/engine/defaults/runner.py", line 63, in launch
    evaluator.exec()
  File "/DCPose/engine/defaults/evaluator.py", line 20, in exec
    self.eval()
  File "/DCPose/engine/defaults/evaluator.py", line 73, in eval
    phase=self.phase)
  File "/DCPose/engine/core/function.py", line 165, in eval
    input_x, input_sup_A, input_sup_B, target_heatmaps, target_heatmaps_weight, meta = next(self.dataloader_iter)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/DCPose/datasets/zoo/posetrack/PoseTrack.py", line 100, in __getitem__
    return self._get_spatiotemporal_window(data_item)
  File "/DCPose/datasets/zoo/posetrack/PoseTrack.py", line 166, in _get_spatiotemporal_window
    self.logger.error(error_msg)
AttributeError: 'PoseTrack' object has no attribute 'logger'

runyangFeng commented 3 years ago

@HoBeom

runyangFeng · Answer 1 · Wed Apr 07 2021 10:26:42 GMT+0800 (China Standard Time)

About the performance:

Some details may differ.
We have not released our final model because it is part of our upcoming work.

About the error:
logger is just a tool for printing information; you can comment it out if it causes errors.
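
An alternative to commenting the call out is to give the dataset object a logger, so that the message it was about to report stays visible. The sketch below uses only the standard logging module; the class name PoseTrackSketch, the "image" key, and the message text are hypothetical stand-ins, not the actual code in PoseTrack.py.

```python
import logging


class PoseTrackSketch:
    """Hypothetical stand-in for the dataset class that raised the AttributeError."""

    def __init__(self):
        # Defining the attribute in __init__ means every later
        # self.logger.error(...) call has something to resolve to.
        self.logger = logging.getLogger(__name__)

    def _get_spatiotemporal_window(self, data_item):
        # Assumed failure condition, for illustration only.
        if "image" not in data_item:
            error_msg = f"missing image entry in {data_item!r}"
            self.logger.error(error_msg)
            raise ValueError(error_msg)
        return data_item["image"]


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    dataset = PoseTrackSketch()
    print(dataset._get_spatiotemporal_window({"image": "frame_0001.jpg"}))
```

Note that in the traceback the AttributeError is raised while the dataset is trying to log error_msg, so it hides an underlying data-loading problem; with a working logger, that original message would be printed instead.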

jeonhobeom · Answer 2 · Wed Apr 07 2021 11:21:07 GMT+0800 (China Standard Time)

Thank you for responding. I'll figure out the error. I have changed your code to work with CUDA 11.1 and to use the torchvision implementation of the deform conv module (forked code at https://github.com/HoBeom/DCPose). The following results were obtained with the same settings, and we would like to conduct a follow-up study based on this work. Thank you.
Posetrack 2018 val (using torchvision.ops.deform_conv2d), batch size 32, single 2080 Ti, 20 epochs

Model	Head	Shoulder	Elbow	Wrist	Hip	Knee	Ankle	Mean
DcPose_RSN	83.7925	86.4388	81.4788	75.82	77.8675	78.0722	72.4103	79.7035