Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)


RandomShift or RandomRotate

Lizhinwafu opened this issue

I noticed that the network uses some data augmentation techniques during training, such as point cloud rotation or translation. Can these data augmentation techniques improve the accuracy of model training?

Mostly, they can.
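
For intuition, here is a minimal sketch of what rotation and shift augmentation typically look like for an (N, 3) point cloud. This is an illustrative NumPy version, not Pointcept's actual RandomRotate / RandomShift transforms; the angle range and shift scale are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotate_z(points, max_angle=np.pi):
    """Rotate an (N, 3) point cloud by a random angle around the z-axis."""
    theta = rng.uniform(-max_angle, max_angle)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T

def random_shift(points, scale=0.2):
    """Translate the whole cloud by one random offset per axis."""
    return points + rng.uniform(-scale, scale, size=(1, 3))

# Applied on the fly during training, so every epoch sees new variants
# of the same scene; labels are unchanged by these rigid transforms.
cloud = rng.normal(size=(1024, 3))
augmented = random_shift(random_rotate_z(cloud))
```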

In my research, I manually rotated the raw point clouds along different axes as a form of manual data augmentation. I used the PointNet++ model, and my mIoU did indeed improve. However, after submission, a reviewer raised some questions, and I am unsure how to respond. Can you give me some suggestions?
“Although the authors described the data expansion method, for the PointNet family of models the T-Net transformation matrix adjusts for rigid transformations after the data are input to the model. This is a feature of PointNet's input point cloud and feature alignment module. In addition, PointNet proposes symmetric functions to handle point cloud disorder, which enables the subsequent 3D recognition task. Overall, the symmetric function and the T-Net transformation matrix adjust the input data to ensure effective training of the model. Therefore, I question rotation as a data expansion method proposed in the article.”

Oh, that is a hard question to answer. I am also unsure how to respond, but I have attached one version just for your reference:

Dear reviewer, thank you for pointing out such a crucial point about the nature of point cloud processing. We believe it is an important issue worth discussing and will include it in our final version.

In a word, it is the symmetric functions that grant the model the potential to handle the unordered nature of point clouds, and it is the augmented data that grants the model the actual capacity to encode complex point clouds robustly through learning. Imagine feeding a point cloud rotated by different angles to a PointNet++ trained without augmentation: even though the point cloud is encoded with the symmetric function and the local T-Net transformation matrix (which can also be viewed as a kind of permutation-invariant adaptive kernel, though certainly not an input-invariant one), the input feature (coordinates) of each point is different, so the output representation of each point will also be different. Yet we know that, fundamentally, these rotated point clouds should share a similar representation, since the meaning of the point cloud, as well as of the given points, should not change after rotation. By including rotation-augmented data in the training process, the model learns this ability from the prior information.
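
To make this concrete, below is a small runnable demonstration of the distinction. It is a toy sketch: a random linear layer plus tanh stands in for the learned per-point MLP (it is not the actual PointNet encoder), and max pooling plays the role of the symmetric function.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))  # stand-in for a shared per-point MLP

def encode(points):
    """PointNet-style sketch: per-point features + symmetric max pooling."""
    feats = np.tanh(points @ W)
    return feats.max(axis=0)  # max over points: point order is irrelevant

points = rng.normal(size=(8, 3))
perm = rng.permutation(len(points))
theta = np.pi / 4
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])

print(np.allclose(encode(points), encode(points[perm])))      # True: permutation invariant
print(np.allclose(encode(points), encode(points @ rot_z.T)))  # False: rotation changes the encoding
```

The symmetric function removes sensitivity to point order, but nothing in the architecture removes sensitivity to the pose of the input, which is exactly the gap that rotation augmentation fills.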

Hope my version of the response can be helpful. Perhaps the reviewer misunderstood: permutation invariance does not mean input invariance, and rotation changes the input features of the point cloud.

Haha, thank you so much. I do not agree with the reviewer's opinion, though, because PointNet++ does not seem to have a T-Net structure.

Haha, never say no to your reviewer. ╮(╯▽╰)╭