huawei-noah / noah-research

Noah Research

[TokenFusion] What is the meaning of x[0] and x[1] in TokenFusion segmentation?

CaffreyR opened this issue

Hi, many thanks for your work. While trying to reproduce your code, I noticed that the forward pass has 4 stages, and each stage uses code like the following (stage 4 shown here):

x, H, W = self.patch_embed4(x)
for i, blk in enumerate(self.block4):
    score = self.score_predictor[3](x)
    mask = [F.softmax(score_.reshape(B, -1, 2), dim=2)[:, :, 0] for score_ in score]  # mask_: [B, N]
    masks.append(mask)
    x = blk(x, H, W, mask)
x = self.norm4(x)
x = [x_.reshape(B, H, W, -1).permute(0, 3, 1, 2).contiguous() for x_ in x]
outs0.append(x[0])
outs1.append(x[1])
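
To make the mask line concrete, here is a minimal plain-Python sketch of what it appears to compute. It assumes, as the list comprehension suggests, that score is a list with one entry per input stream and that each token carries two logits. The batch dimension is dropped for brevity, and the names keep_prob and masks_per_stream and the logit values are hypothetical, not taken from the repository.

```python
import math

def keep_prob(s0, s1):
    """Softmax over two logits; return the probability assigned to index 0."""
    m = max(s0, s1)  # subtract the max for numerical stability
    e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
    return e0 / (e0 + e1)

def masks_per_stream(score):
    """score: list of streams, each a list of (s0, s1) logit pairs, one per token.

    Mirrors `[F.softmax(...)[:, :, 0] for score_ in score]` without the batch axis:
    the result keeps the outer list and holds one value in (0, 1) per token.
    """
    return [[keep_prob(s0, s1) for s0, s1 in stream] for stream in score]

# Hypothetical logits: 2 streams with 3 tokens each.
score = [
    [(2.0, 0.0), (0.0, 0.0), (-1.0, 1.0)],
    [(0.5, 0.5), (3.0, -3.0), (0.0, 4.0)],
]
mask = masks_per_stream(score)
```

Under these assumptions each stream gets its own per-token mask, so mask[0] and mask[1] line up with x[0] and x[1]. With only two classes, the value at index 0 equals sigmoid(s0 - s1), so each mask entry is a single per-token score between 0 and 1.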

Do x[0] and x[1] refer to the RGB and depth inputs, respectively? If so, where does the token fusion take place?

Many thanks!