hehefan / P4Transformer

Implementation of the "Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos" paper.


Intuition behind choice of input points, ball query radius, nsamples and spatial-stride

sheshap opened this issue

Dear @hehefan ,

From the train-msr.py file, I see the default values
input points = 2048,
ball query radius = 0.7,
nsamples = 32,
spatial-stride = 32

On a given point cloud frame that contains 2048 points, 32 farthest-point-sampled (FPS) points are selected (a small number). Around each of them, a ball with a radius of 0.7 (large) is used to query 32 points (a very small number).

I have a few questions to understand the setting.

  1. Wouldn't the queried points (32) around each farthest point be very close to it, since the overall object contains 2048 points (dense)?
  2. What is the intuition behind choosing a larger radius but querying only 32 points at each FPS point?

I suspect that, with 2048 input points but only 32 points queried around each of the 32 farthest points, the points considered from a given frame at a time are limited to 32x32 = 1024 points. And these points form clusters of 32 around each of the 32 farthest points due to ball querying.

Please help me understand the intuition behind the design.

Thanks in advance.

Hi,

spatial-stride is the FPS down-sampling rate. This operation leads to 2048/32=64 anchor points.

For each anchor point, we search for its neighbours within the radius (0.7).

Because points usually have different numbers of neighbours within the ball query radius, we sample nsamples (32) of them.

In summary, given a frame with 2048 points, we select 64 anchor points. For each anchor point, we sample 32 neighbors whose distances to the anchor point are less than 0.7.
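The two steps summarized above (FPS down-sampling, then ball query grouping) can be sketched in plain NumPy. This is a minimal illustration of the sampling logic, not the repo's CUDA implementation; the function names here are made up for the example.

```python
import numpy as np

def farthest_point_sampling(points, n_anchors):
    """Iteratively pick the point farthest from the already-chosen set."""
    n = points.shape[0]
    chosen = np.zeros(n_anchors, dtype=int)
    dist = np.full(n, np.inf)
    chosen[0] = 0  # start from an arbitrary point
    for i in range(1, n_anchors):
        # distance from every point to its nearest already-chosen anchor
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[i - 1]], axis=1))
        chosen[i] = int(np.argmax(dist))
    return chosen

def ball_query(points, anchors, radius, nsample):
    """For each anchor, gather up to nsample neighbour indices within radius."""
    groups = []
    for a in anchors:
        idx = np.flatnonzero(np.linalg.norm(points - points[a], axis=1) < radius)[:nsample]
        # pad by repeating the first found neighbour, as PointNet++ does
        pad = np.full(nsample - idx.size, idx[0])
        groups.append(np.concatenate([idx, pad]))
    return np.stack(groups)

rng = np.random.default_rng(0)
frame = rng.standard_normal((2048, 3))                # one point cloud frame
anchors = farthest_point_sampling(frame, 2048 // 32)  # spatial-stride = 32 -> 64 anchors
groups = ball_query(frame, anchors, radius=0.7, nsample=32)
print(anchors.shape, groups.shape)                    # (64,) and (64, 32)
```

Note the spatial-stride of 32 yields 2048/32 = 64 anchors, and every grouped neighbour lies within 0.7 of its anchor.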

Because points are significantly down-sampled, we use a bigger radius to include more neighbours for each anchor point.

Best regards

Thank you @hehefan.

There will be many more than 32 points within the ball that are at a distance <= 0.7.

The CUDA code (of PointNet++) picks the ones nearest to the anchor point, even though this is a ball query and not a kNN query.

Your method would still work the same even if the radius were 0.2 (I will run this experiment).

Using a bigger radius alone will not fetch more neighbours unless the nsamples value is also high. In your case, it is just 32.

Can you confirm whether the queried points are picked randomly within the ball, rather than based on how near they are to the anchor point?

Hi,

The ball query (ball_query_gpu.cu) in PointNet++ scans the points in a cloud from first to last. Once enough (nsample) neighbours have been collected, the search stops. Therefore, the collected neighbours depend on their order in the cloud. Because point clouds are unordered and points are randomly selected during data preprocessing, the neighbour sampling can be considered random.
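The order-dependence described above can be demonstrated with a small Python sketch that mimics the early-exit scan (this is an illustrative re-implementation, not the actual CUDA kernel):

```python
import numpy as np

def ball_query_first_k(points, anchor, radius, nsample):
    """Mimic the PointNet++ ball query: scan points in storage order and
    stop as soon as nsample neighbours within the radius are found."""
    idx = []
    for i, p in enumerate(points):
        if np.linalg.norm(p - anchor) < radius:
            idx.append(i)
            if len(idx) == nsample:
                break  # early exit: later (possibly nearer) points are never examined
    return idx

rng = np.random.default_rng(1)
pts = rng.standard_normal((2048, 3))
anchor = pts[0]

a = ball_query_first_k(pts, anchor, radius=0.7, nsample=32)
b = ball_query_first_k(pts[::-1].copy(), anchor, radius=0.7, nsample=32)
# Scanning the same cloud in reverse order generally returns a different set of
# neighbours: the result depends on point order, not on distance to the anchor.
# After the random shuffle in preprocessing, the sampling is effectively random.
```

Because the scan stops at the first nsample hits rather than sorting by distance, a larger radius mainly changes *which* points are eligible, not how many are returned.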

Best regards.