hongfz16 / HCMoCo

[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Question about the setting

zhihou7 opened this issue

Hi, thanks for your interesting work.
I am confused about the setting. Do the downstream tasks use the same training data as HCMoCo pre-training? In other words, is there any difference between the pre-training modalities and the downstream modalities? Maybe I missed something in the paper, but I could not find an explicit explanation, and I may be misreading the following description:

To evaluate HCMoCo, we transfer our pre-train model to four human-centric downstream tasks using different modalities,

Thank you for your interest in our work.

HCMoCo uses RGB, depth, and 2D keypoints for pre-training. We transfer the pre-trained RGB backbone to DensePose estimation and RGB human parsing, and the pre-trained depth backbone to depth human parsing and depth 3D skeleton prediction.
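
For readers unfamiliar with this kind of transfer, here is a minimal PyTorch sketch of the protocol: load one modality's pre-trained backbone weights, attach a freshly initialized task head, and fine-tune on the downstream dataset. This is not the repository's actual code; the checkpoint path, the `rgb_backbone.` key prefix, and the toy head are all assumptions for illustration.

```python
# Hypothetical sketch of transferring an HCMoCo pre-trained backbone to a
# downstream task. The checkpoint path and the "rgb_backbone." key prefix
# are assumptions; the real checkpoint layout may differ.
import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet50(weights=None)
backbone.fc = nn.Identity()  # expose 2048-d features instead of classification logits

ckpt = torch.load("hcmoco_pretrain.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # unwrap if the weights are nested
# Keep only the RGB-branch weights and strip the branch prefix.
rgb_state = {k[len("rgb_backbone."):]: v
             for k, v in state.items() if k.startswith("rgb_backbone.")}
backbone.load_state_dict(rgb_state, strict=False)

# A toy linear head stands in for the real dense-prediction head
# (e.g. a parsing decoder); everything is fine-tuned end to end.
model = nn.Sequential(backbone, nn.Linear(2048, 20))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```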

The modalities are the same, but for some downstream tasks, such as DensePose estimation and RGB human parsing, we use different datasets (MPII and NTURGBD for pre-training versus COCO or Human3.6M for downstream evaluation), which introduces a domain gap.

I hope the above explanation clarifies the setting.

Thanks for your reply. I get it.