hustvl / HAIS

Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)


[Discussion on consistency] Custom data with Realsense L515 + ORB-SLAM3 + Open3D reconstruction

glennliu opened this issue

Hi Shaoyun,

Thanks for your work. I'm interested in how HAIS can help a SLAM system extract persistent semantic landmarks.

I have tested HAIS on my own dataset:

  • Hardware: Intel Realsense L515 RGB-D camera
  • Localization: ORB-SLAM3 with RGB-D input only
  • Reconstruction: Open3D RGB-D integration, which is TSDF-based and extracts a point cloud after the entire scan is finished (see the sketch below).
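
For reference, the integration step looks roughly like this (a minimal sketch against the Open3D >= 0.10 API; the intrinsics, depth scale, and the `frames` list are placeholders for my setup):

```python
import numpy as np
import open3d as o3d

# Placeholder intrinsics; substitute the calibrated values for your L515.
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    width=640, height=480, fx=600.0, fy=600.0, cx=320.0, cy=240.0)

volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=4.0 / 512.0,  # ~8 mm voxels
    sdf_trunc=0.04,
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# Placeholder sequence: (color_path, depth_path, camera-to-world pose)
# triplets, with poses taken from the ORB-SLAM3 trajectory.
frames = []

for color_path, depth_path, pose in frames:
    color = o3d.io.read_image(color_path)
    depth = o3d.io.read_image(depth_path)
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth,
        depth_scale=4000.0,  # the L515 reports depth in 0.25 mm units by default
        depth_trunc=3.0, convert_rgb_to_intensity=False)
    # integrate() expects the world-to-camera extrinsic, i.e. the inverse pose.
    volume.integrate(rgbd, intrinsic, np.linalg.inv(pose))

# The point cloud is only extracted after the whole scan has been integrated.
pcd = volume.extract_point_cloud()
o3d.io.write_point_cloud("scene.ply", pcd)
```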

Here are some results (colored by raw RGB and by semantic segmentation):

a. Living room
[images: living_mesh, living]

b. Dining room
[images: dinning_mesh, dinning]

c. Study room
[images: studyroom_mesh, studyroom]

The qualitative results show quite a lot of over-segmentation. In the ScanNet test results I can also see some over-segmentation, but it is far less frequent than on my custom dataset. Moreover, over-segmentation on ScanNet normally occurs in poorly reconstructed sub-volumes, whereas on the custom data it occurs even in well-reconstructed areas, such as the Living Room scan.

So, what is actually affecting the segmentation performance in the self-collected dataset?

  • The earlier issue #19 also discusses this, and some suggest modifying class_numpoint_mean_dict. But I'm running on indoor scenes similar to ScanNet, so the mean point count and mean radius per class should differ only slightly; that dictionary should not explain the inconsistency across datasets in my case (see the sketch after this list).
  • How robust is HAIS to domain shift? What kinds of issues can affect consistency across different datasets?
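
For context, my rough understanding of that dictionary's role, as an illustrative sketch (not the repo's actual code; the class values and the radius schedule here are made up):

```python
import numpy as np

# Hypothetical per-class mean point counts; the real dict ships with HAIS.
class_numpoint_mean_dict = {"chair": 1200.0, "table": 3500.0}

def absorption_radius(primary_class, base_radius=0.5):
    """Made-up schedule: classes whose instances are expected to be larger
    absorb fragments from farther away (cube root keeps the growth gentle)."""
    expected = class_numpoint_mean_dict[primary_class]
    return base_radius * (expected / 1000.0) ** (1.0 / 3.0)

def should_absorb(primary_class, primary_center, fragment_center):
    """Absorb a small fragment into a primary instance when it falls
    inside the class-dependent radius around the primary's center."""
    dist = np.linalg.norm(np.asarray(fragment_center) - np.asarray(primary_center))
    return dist < absorption_radius(primary_class)

# Example: a fragment 0.4 m from a predicted chair center gets absorbed.
print(should_absorb("chair", (0.0, 0.0, 0.0), (0.4, 0.0, 0.0)))
```

If the per-class statistics of my scenes really do match ScanNet's, these radii barely change, which is why I doubt this dictionary explains the gap.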

Thanks
Chuhao

I also ran it on several SceneNN scenes.
[images: bedroomnn, bedroomnn_semantic]
The scene is also reconstructed with Open3D integration, but using the ground-truth poses.

I can also see some over-segmentation here, such as the broken chair and table.
I understand this is a long-standing challenge in scene segmentation and cannot be perfectly solved, but I want to discuss what affects it and how to improve it across different datasets.

Thanks

Thanks for the interesting question. Over-segmentation and under-segmentation relate to instance segmentation, not semantic segmentation. Your visualization shows that the class labels of some points are wrongly predicted, so the problem lies in the first stage. HAIS is trained on ScanNetV2, whose data amount is far from enough; I think the data amount accounts for the wrong predictions, and adopting more data may alleviate the problem. There do also exist gaps between datasets, e.g., point cloud density, so some parameters may need to be tuned for better results on other datasets; one option is to match ScanNet's point density before inference, as in the sketch below.
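
A minimal sketch, assuming Open3D is available; the 2 cm voxel size is only a starting guess to tune per dataset:

```python
import open3d as o3d

# Downsample a dense custom scan so its point density is closer to the
# ScanNet scans HAIS was trained on, then run inference on the result.
pcd = o3d.io.read_point_cloud("scene.ply")          # hypothetical input path
pcd_down = pcd.voxel_down_sample(voxel_size=0.02)   # ~2 cm voxels, tune this
o3d.io.write_point_cloud("scene_down.ply", pcd_down)
```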
Hope these help you.

@outsidercsy Thanks for your time.
I agree that data amount is one of the reasons. However, my datasets are already close to ScanNet.
The scenes I chose are very similar to scenes in ScanNet (indoor living rooms, etc.), and ScanNet is reconstructed with TSDF mapping, the same as what I and SceneNN used. The RGB-D cameras differ, but that should not make too much of a difference. Besides, HAIS is trained on 1,200 ScanNet scans, which is not a small amount. So I was expecting performance very similar to that on ScanNet.
Overall, I think the data amount does matter, but there should be a more general way to improve the consistency of the segmentation network.

I mean that even ScanNet's data amount is not enough for convergence, and I also observe many cases with poor segmentation results on ScanNet itself. What about pretraining on ScanNet and then fine-tuning on your dataset? Something along the lines of the sketch below could work.
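
A generic PyTorch sketch of the idea (not our actual train script; the checkpoint path, state-dict key, and module names are all assumptions):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # placeholder; substitute the instantiated HAIS network

# Load the ScanNet-pretrained weights (hypothetical path and key name).
checkpoint = torch.load("hais_scannet_pretrained.pth", map_location="cpu")
state = checkpoint.get("state_dict", checkpoint)
model.load_state_dict(state, strict=False)

# Optionally freeze the (assumed) backbone so only later layers adapt.
for name, param in model.named_parameters():
    if name.startswith("backbone"):
        param.requires_grad = False

# Fine-tune with a learning rate well below the from-scratch value.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```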

Some sub-volumes in ScanNet are also reconstructed poorly, so it is reasonable that those cannot be well segmented. I'd like to try fine-tuning on my dataset. Do you have any suggestions on which parameters should be considered during fine-tuning?

Here are more results (raw RGB, plus instance segmentation with semantic labels):

Living room
[images: livingb_mesh, livingb]

Washing room
[images: washing_mesh, washing]

Bedroom
[images: bedroom_mesh, bedroom]

Beyond over-segmentation, there are also a few incorrectly predicted semantic labels.

Thanks for the visualization. As for data amount, ScanNet200 has been released with many more annotated categories and instances. More training data helps with the bad cases: https://rozdavid.github.io/scannet200