About the units of the preprocessed CO3Dv2 data and the post-training step
Hi, thanks for your great work. Your pretrained model works really well on my own dataset.
I still have a few questions.
- The README provides preprocess.py for the CO3Dv2 dataset, and I have a few questions about the preprocessed data (see the sketch after this list for how I currently read it):
  - What are the units of the depth maps? During training they are divided by 65535 in co3d.py. Are the values in meters after that division?
  - Is the translation vector $T$ in the preprocessed output given in meters?
  - Are the camera extrinsics $R$ camera-to-world (C2W)?
  - Is 'selected_seqs_test.json' the list of RGB frame indices used to build the pairs?
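For reference, here is roughly how I read the preprocessed data at the moment. This is only a minimal sketch with stand-in values; the depth scale and the camera-to-world convention are my assumptions, so please correct me if any step is wrong.

```python
import numpy as np

# My current interpretation of the preprocessed CO3Dv2 data (assumptions, not facts).
# Depth: the 16-bit depth map divided by 65535 -- is the result already in meters?
depth_uint16 = np.full((4, 4), 32768, dtype=np.uint16)        # stand-in for a loaded depth map
depth = depth_uint16.astype(np.float32) / 65535.0             # meters after this division?

# Extrinsics: I assume R, T form a camera-to-world transform, with T in meters.
R = np.eye(3, dtype=np.float32)                               # stand-in for the stored rotation
T = np.array([0.1, 0.0, 2.5], dtype=np.float32)               # stand-in for the stored translation
cam_to_world = np.eye(4, dtype=np.float32)
cam_to_world[:3, :3] = R
cam_to_world[:3, 3] = T

# Back-projection of pixel (u, v) under that interpretation, with intrinsics K:
K = np.array([[500.0,   0.0, 2.0],
              [  0.0, 500.0, 2.0],
              [  0.0,   0.0, 1.0]], dtype=np.float32)
u, v = 1, 2
p_cam = np.linalg.inv(K) @ np.array([u, v, 1.0], dtype=np.float32) * depth[v, u]
p_world = cam_to_world[:3, :3] @ p_cam + cam_to_world[:3, 3]  # world point, if R/T are indeed C2W
print(p_world)
```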
- The provided pretrained model works well on my own dataset, but the global alignment step is time-consuming.
- As far as I can tell, the training loss is computed on image pairs only, with no multi-view term, so global alignment is a post-processing step that is not part of training. Is my understanding correct?
- In my setting, the camera poses of the training-set input views are the same as those of the test set. Would it be feasible to fine-tune (post-train) your provided model on my own training set so that global alignment can be skipped at inference time? A rough sketch of what I have in mind follows this list. Could you please give me some advice?
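To make the last question concrete, here is a rough, hypothetical sketch of what I would like inference to look like after fine-tuning. The function names below are placeholders I made up, not your actual API; I only want to confirm whether this direction is sensible.

```python
import numpy as np

# Hypothetical sketch of "skipping global alignment at inference".
# `predict_pointmaps` is a placeholder for one pairwise forward pass of the
# (fine-tuned) model; it is NOT part of your code base.

def predict_pointmaps(model, ref_img, src_img):
    # Placeholder: both outputs are assumed to be (H, W, 3) pointmaps
    # expressed in the reference camera's frame.
    h, w = src_img.shape[:2]
    return np.zeros((h, w, 3), np.float32), np.zeros((h, w, 3), np.float32)

def reconstruct_without_alignment(model, images):
    # Pair every view with one fixed reference view. If the fine-tuned model is
    # consistent across such pairs, the per-pair pointmaps should already share
    # a common frame, so I could simply concatenate them and skip global alignment.
    ref = images[0]
    points = []
    for src in images[1:]:
        _, pts_src = predict_pointmaps(model, ref, src)
        points.append(pts_src.reshape(-1, 3))
    return np.concatenate(points, axis=0)

# Dummy usage with made-up image sizes:
imgs = [np.zeros((8, 8, 3), np.uint8) for _ in range(4)]
cloud = reconstruct_without_alignment(None, imgs)
print(cloud.shape)
```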
Best regards,
VillardX