Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)

Why still have KNN in code if we use grid pooling to replace FPS+KNN?

Stronger-Huang opened this issue · comments

Thank you for your great work on PTv2!

```python
class BlockSequence(nn.Module):
    ...
    def forward(self, points):
        coord, feat, offset = points
        # reference index query of neighbourhood attention
        # for window attention, modify the reference index query method
        reference_index, _ = pointops.knn_query(self.neighbours, coord, offset)  # self.neighbours = 16; returns idx, torch.sqrt(dist2)
        for block in self.blocks:
            points = block(points, reference_index)
        return points
```

Here in class BlockSequence, I see that the stage after grid pooling still runs a KNN query. But the PTv2 paper says FPS+KNN is replaced by grid pooling, and that the mIoU is better. So why does the code still use KNN?

I would appreciate it very much if you could reply!

Because neighbourhood attention still needs KNN to determine the attention kernel, i.e. each point's local neighbourhood. That KNN is not for the pooling layer.
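For illustration only, a toy sketch with random tensors (not the actual PTv2 block): the index returned by `pointops.knn_query` is used to gather each point's K nearest-neighbour features, and that (K, C) group per point is what neighbourhood attention operates on.

```python
import torch

# feat:            (N, C) per-point features
# reference_index: (N, K) indices of the K nearest neighbours of each point,
#                  e.g. the first output of pointops.knn_query
feat = torch.randn(1024, 48)                          # toy features, N=1024, C=48
reference_index = torch.randint(0, 1024, (1024, 16))  # pretend KNN result, K=16

# Advanced indexing turns the index into the local "kernel" that
# neighbourhood attention runs over: one (K, C) group per point.
neighbour_feat = feat[reference_index]                # shape (1024, 16, 48)
```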

So, in fact, the `GridPool` class below can replace FPS + KNN?

```python
class GridPool(nn.Module):
    def __init__(self, in_channels, out_channels, grid_size, bias=False):
        super(GridPool, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.grid_size = grid_size

        self.fc = nn.Linear(in_channels, out_channels, bias=bias)
        self.norm = PointBatchNorm(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, points, start=None):
        coord, feat, offset = points
        batch = offset2batch(offset)
        feat = self.act(self.norm(self.fc(feat)))
        # Per-sample minimum coordinate, used as the origin of the voxel grid.
        start = (
            segment_csr(
                coord,
                torch.cat([batch.new_zeros(1), torch.cumsum(batch.bincount(), dim=0)]),
                reduce="min",
            )
            if start is None
            else start
        )
        # Assign every point to a voxel cell of edge length grid_size.
        cluster = voxel_grid(
            pos=coord - start[batch], size=self.grid_size, batch=batch, start=0
        )
        unique, cluster, counts = torch.unique(
            cluster, sorted=True, return_inverse=True, return_counts=True
        )
        _, sorted_cluster_indices = torch.sort(cluster)
        idx_ptr = torch.cat([counts.new_zeros(1), torch.cumsum(counts, dim=0)])
        # One pooled point per non-empty cell: mean coordinate, max-pooled feature.
        coord = segment_csr(coord[sorted_cluster_indices], idx_ptr, reduce="mean")
        feat = segment_csr(feat[sorted_cluster_indices], idx_ptr, reduce="max")
        batch = batch[idx_ptr[:-1]]
        offset = batch2offset(batch)
        return [coord, feat, offset], cluster
```
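For reference, a minimal usage sketch of the class above; the channel sizes, grid size, and point cloud are made up, and `offset` follows Pointcept's convention of cumulative per-sample point counts.

```python
coord = torch.rand(1000, 3)      # toy point cloud: 1000 points, xyz coordinates
feat = torch.randn(1000, 48)     # 48-channel input features
offset = torch.tensor([1000])    # cumulative point counts per batch item

pool = GridPool(in_channels=48, out_channels=96, grid_size=0.1)
(coord_p, feat_p, offset_p), cluster = pool([coord, feat, offset])
# coord_p / feat_p keep one point per non-empty voxel cell (mean coordinate,
# max-pooled feature); cluster maps every original point to its cell and can
# be reused for unpooling in the decoder.
```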

And the KNN in BlockSequence is just for neighbourhood attention?
I would appreciate it very much if you could reply!

Yes. And in our PTv3, KNN is fully removed from the pipeline.
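As a rough illustration of how neighbour grouping can work without any KNN (a simplified sketch, not PTv3's actual implementation): points are quantized to a grid, sorted along a serialization key (PTv3 uses Z-order / Hilbert space-filling curves), and chunked into fixed-size patches that attention then operates on.

```python
import torch

def serialize_and_patch(coord, feat, grid_size=0.05, patch_size=16):
    # Quantize to grid cells and shift so all cell indices are non-negative.
    g = torch.div(coord, grid_size, rounding_mode="floor").long()
    g = g - g.min(dim=0).values
    # Crude lexicographic serialization key (PTv3 uses space-filling curves instead).
    key = g[:, 0] * (1 << 40) + g[:, 1] * (1 << 20) + g[:, 2]
    order = torch.argsort(key)
    feat = feat[order]
    # Chunk the serialized sequence into fixed-size attention patches;
    # the ragged tail is dropped here for simplicity (real code pads instead).
    n = feat.shape[0] - feat.shape[0] % patch_size
    return feat[:n].reshape(-1, patch_size, feat.shape[1])  # (num_patches, P, C)

# Toy usage: 4096 random points with 48-channel features.
patches = serialize_and_patch(torch.rand(4096, 3), torch.randn(4096, 48))
```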