Yang7879 / 3D-BoNet

Hi, thanks for your great work! Can you explain the meaning of 'ins_labels' and 'sem_labels' in helper_data_s3dis.py, lines 44 and 45? I am not sure I understand them correctly. Looking forward to your reply!

Hi @pomelo93, sem_labels --> the ground truth category of each point, ins_labels --> the object instance label of each point. I suggest visualizing the semantic and instance labels via the lines below:

3D-BoNet/helper_data_s3dis.py

Line 76 in 37bf0be

## if u need to visulize data, uncomment the following lines
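To make the distinction concrete, here is a minimal sketch with a made-up block of 10 points. The array values and the category ids (0 = chair, 1 = table) are assumptions for illustration only; they are not taken from the S3DIS files or from helper_data_s3dis.py.

```python
import numpy as np

# Hypothetical toy block of 10 points (x, y, z); values are random.
points = np.random.rand(10, 3)

# sem_labels: one category id per point (assumed ids: 0 = chair, 1 = table).
sem_labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
# ins_labels: one object-instance id per point; the two chairs get ids 0 and 1.
ins_labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])

# Both label arrays are aligned with the points: one label per point.
assert sem_labels.shape == ins_labels.shape == (points.shape[0],)

# Every instance belongs to exactly one category, so we can look it up
# from the first point of each instance.
ins_to_sem = {int(i): int(sem_labels[ins_labels == i][0])
              for i in np.unique(ins_labels)}
print(ins_to_sem)  # {0: 0, 1: 0, 2: 1}
```

Instances 0 and 1 map to the same category (both chairs), while instance 2 is the table.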

Hi @Yang7879, thanks for your reply! I have another question about something I don't understand in your paper. The caption of Figure 7 says: "Different color indicates different instance. The same instance may not have the same color". Why might the same instance not have the same color? Does "the same instance" mean the same object or the same category? Looking forward to your reply!

@pomelo93 To plot the instance segmentation results of different approaches, we generate a random color for each object instance, so the same object may not have the same color for different approaches.

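@Yang7879 Hello, can I understand it this way? Suppose a room has 5 instances: 2 chairs ==> ins_id=0, ins_id=1; 1 table ==> ins_id=2; 1 sofa ==> ins_id=3; 1 bookcase ==> ins_id=4. Then ins_labels might be [0,0,1,1,1,2,2,3,4,3,...], and in helper_data_plot.py, np.unique(pc_semins)=[0,1,2,3,4]. That means when colors are chosen at random, the different instance ids 0, 1, 2, 3, 4 are shown in different colors, even though ins_id=0 and ins_id=1 are both chairs, i.e. the same category. In other words, plotting assigns one color to each instance, but different instances may belong to the same category, for example two instances that are both chairs. However, if I pass pc_semins = sem_labels to draw_pc_semins (helper_data_plot.py, line 37) when visualizing, points of the same category are shown in the same color even if they come from different instances, for example the 2 chairs. One more question: at prediction time, are at most 24 instances predicted per room? I see ins_max_num = 24 in the code. Looking forward to your reply, thanks!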

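@Yang7879 Hello, is it like this? At prediction time, suppose 4096 points in a room actually come from the same object, say a sofa, but the network predicts, say, 300 of them as a different object. Then the same object is shown in different colors in the visualization. Is that right? Also, is there anything wrong in my comment above? Thanks!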

@pomelo93 Hello, everything you wrote above is correct. Also, ins_max_num = 24 sets the maximum to 24 instances; you can set this value according to your own dataset.
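The coloring behavior discussed above can be sketched as follows. `colors_for_labels` is a hypothetical helper written for this example, not the `draw_pc_semins` function in helper_data_plot.py, and the category ids (0 = chair, 1 = table, 2 = sofa, 3 = bookcase) are assumed.

```python
import numpy as np

def colors_for_labels(labels, seed=None):
    """Assign one random RGB color to each unique label (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    unique_ids, inverse = np.unique(labels, return_inverse=True)
    palette = rng.random((len(unique_ids), 3))  # one random color per unique id
    return palette[inverse]                     # per-point colors, shape (N, 3)

# Per-point labels for the room described above (first 10 points only).
ins_labels = np.array([0, 0, 1, 1, 1, 2, 2, 3, 4, 3])  # two chairs: ids 0 and 1
sem_labels = np.array([0, 0, 0, 0, 0, 1, 1, 2, 3, 2])  # both chairs share id 0

ins_colors = colors_for_labels(ins_labels, seed=0)
sem_colors = colors_for_labels(sem_labels, seed=0)

# Points 0 and 2 belong to different chairs: different instance colors,
# but the same semantic color.
print(np.allclose(ins_colors[0], ins_colors[2]))
print(np.allclose(sem_colors[0], sem_colors[2]))

# A second plot with fresh random colors (e.g. for another approach)
# gives the same instance a different color.
other_colors = colors_for_labels(ins_labels, seed=1)
print(np.allclose(ins_colors[0], other_colors[0]))
```

Because the palette is drawn at random for each plot, colors identify instances only within a single plot, not across plots of different approaches.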

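@Yang7879 Hello, thank you very much for your reply! I have a few more questions. Each room is split into a number of blocks, and a batch of blocks is fed in at a time for both training and testing, so ins_max_num=24 should mean at most 24 predicted instances per block, right? After block merge, a room may therefore end up with many predicted instances. I printed the test result for Area_5_hallway_1.h5, i.e. Area_5_hallway_1.h5.mat, and np.unique(ins_pred_all)=[0,1,2,...,145], which means 145 instances were predicted for the Area_5_hallway_1 point cloud, right? But the Annotations folder for Area_5_hallway_1 in the S3DIS raw data contains only 26 instances. Why is the difference in instance counts so large? As I understand it, main_eval.py lines 228-236 remove small instances, using the parameter on line 235 (0.3, 0.5, etc.) to filter and keep the more reliable predicted instances. How can the instances be visualized after this removal? The removed instances may come from the same instance, or from different instances of the same category. Also, is np.unique(sem_pred_all) always [0,1,2,...12]? I printed several different .mat files and found that 13 classes were always predicted. Finally, in Section 2.3 (Point Mask Prediction) of your paper, all predicted bounding boxes are processed together with the point features and the global features F_g. Why not feed only the T pairs of bounding boxes matched by the Bounding Box Association Layer, with their corresponding features, into point mask prediction, instead of predicting masks for all predicted bounding boxes? I don't quite understand this. Looking forward to your answer, thanks!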

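@pomelo93
(1) The number of predicted instances differs so much from the ground truth because the block merge algorithm cannot accurately merge the blocks together, which produces many fragmented instances. This step is basically a bottleneck. Ideally the network would test a whole room in one pass and directly output all instances, but the PointNet++ backbone used at the time cannot handle large-scale point clouds, so a new backbone would be needed. I personally think this is good future work.
(2) If you need to visualize the removed fragmented instances, just generate some random colors for them and save the result.
(3) sem_pred_all covers at most 13 classes; it does not have to be exactly 13.
(4) During training, you could indeed use only the T matched bboxes to predict masks. I passed them all in just to keep the code simple.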
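As a rough illustration of point (1), the sketch below counts predicted instances and drops fragments by point count. This is an assumed filtering rule for illustration only: `drop_small_instances` and `min_points` are hypothetical, and this is not the logic in main_eval.py, whose threshold parameter may mean something else.

```python
import numpy as np

def drop_small_instances(ins_pred, min_points):
    """Relabel instances with fewer than `min_points` points as -1.

    Hypothetical rule for illustration; not taken from the repository.
    """
    ids, counts = np.unique(ins_pred, return_counts=True)
    small_ids = ids[counts < min_points]
    filtered = ins_pred.copy()
    filtered[np.isin(ins_pred, small_ids)] = -1
    return filtered

# Made-up per-point instance predictions after merging blocks.
ins_pred = np.array([0, 0, 0, 0, 1, 2, 2, 2, 3, 3])

num_before = np.unique(ins_pred).size  # 4 predicted instances
filtered = drop_small_instances(ins_pred, min_points=3)
num_after = np.unique(filtered[filtered >= 0]).size  # 2 instances kept

print(num_before, num_after)  # 4 2
print(filtered)               # [ 0  0  0  0 -1  2  2  2 -1 -1]
```

Points relabeled as -1 can still be drawn, for example by giving each removed fragment its own random color as suggested in (2).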

@Yang7879 Hello, I can't read Chinese, but I think I got the gist of the above conversation using Google Translate.

Does ins_max_num = 24 mean it will predict up to 24 instances per 4096-point block, and that after merging there could be up to 24 * num_blocks instances in total? Also, if I want to keep the fine-grained instance predictions, what do I need to change? I am trying a custom scene dataset with many more (and smaller) instances than S3DIS. If I increase ins_max_num too much, I run into GPU memory problems. Thank you!

What does coords in the h5 files mean? How is it different from points?

The meaning of ins_labels and sem_labels